What I'm Reading

Hey,

  • This page is my personal archive of research papers I've read and found valuable.
  • Each entry includes the paper title, a link to the PDF/arXiv, and brief notes on why it mattered to me.
  • It's a living reading list – updated whenever I finish something noteworthy.
  • Think of it as a window into my research interests and learning journey.

    2025

    1. CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
      1. Multiple SQL drafts: the model generates many diverse candidate SQL queries for a single question using different prompting strategies.
      2. Pairwise selection: a fine-tuned selector model compares candidates two at a time and picks the winner, which is more accurate than scoring each candidate alone.
      3. Auto-fix errors: retrieves only the needed schema values and repairs buggy candidate queries using execution feedback from the database.
    2. Context Engineering: Sessions, Memory
      1. Context Engineering: goes beyond basic prompting by deliberately assembling the full "world" the AI sees each turn (instructions, history, tools, memories) so it acts intelligently.
      2. Sessions: a short-term "desk" for one chat that holds the active conversation, recent notes, and working state so the AI doesn't forget mid-conversation.
      3. Memory: long-term storage across chats that extracts key facts and preferences from sessions, saves them persistently, and loads them back for personalized, ongoing help.
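The pairwise selection idea from CHASE-SQL can be sketched as a small round-robin tournament. This is illustrative only: `select_best_candidate` is my own helper, and the length-based stub judge is a hypothetical stand-in for the paper's fine-tuned selector model.

```python
import itertools

def select_best_candidate(candidates, compare):
    """Round-robin pairwise selection: every pair of candidate SQL
    queries is judged head-to-head; the candidate with the most
    wins is returned."""
    wins = {c: 0 for c in candidates}
    for a, b in itertools.combinations(candidates, 2):
        wins[compare(a, b)] += 1
    return max(wins, key=wins.get)

# Stub judge: prefers the shorter query (a stand-in for an LLM selector).
judge = lambda a, b: a if len(a) <= len(b) else b

drafts = [
    "SELECT name FROM users WHERE age > 30",
    "SELECT u.name FROM users u WHERE u.age > 30",
    "SELECT name FROM users WHERE users.age > 30",
]
print(select_best_candidate(drafts, judge))  # the shortest draft wins every duel
```

Comparing candidates two at a time, rather than scoring each in isolation, is what the paper reports as the more accurate selection strategy.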
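The session/memory split above can be sketched with two tiny stores feeding one context builder. The `Session`, `MemoryStore`, and `build_context` names are my own illustration, not an API from the source.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    """Short-term working state for one conversation."""
    turns: list = field(default_factory=list)

    def add(self, role, text):
        self.turns.append((role, text))

@dataclass
class MemoryStore:
    """Long-term facts extracted from past sessions, keyed per user."""
    facts: dict = field(default_factory=dict)

    def remember(self, user, fact):
        self.facts.setdefault(user, []).append(fact)

    def recall(self, user):
        return self.facts.get(user, [])

def build_context(system_prompt, memory, user, session):
    """Assemble the full 'world' the model sees this turn:
    instructions, then persistent memories, then the live session."""
    lines = [system_prompt]
    lines += [f"[memory] {f}" for f in memory.recall(user)]
    lines += [f"[{role}] {text}" for role, text in session.turns]
    return "\n".join(lines)

memory = MemoryStore()
memory.remember("alice", "prefers SQL examples")
session = Session()
session.add("user", "Show me a join query.")
print(build_context("You are a helpful assistant.", memory, "alice", session))
```

The point of the pattern is that the session is rebuilt every chat, while the memory store survives across chats and is merged back in each turn.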

    2024

    1. RAG-Fusion: a New Take on Retrieval-Augmented Generation
      1. RAG-Fusion enhances standard RAG by generating multiple related queries from the original one, then fusing retrieval results using Reciprocal Rank Fusion (RRF).
      2. Core advantage: Answers are more accurate and comprehensive because the model sees the question from several angles and prioritises consistently relevant documents.
      3. Main trade-off: roughly 1.8× slower than basic RAG because of the extra LLM call for query generation and the larger fused context.
    2. A Survey on LLM-as-a-Judge
      1. LLM-as-a-Judge means using a large language model (like ChatGPT) as an automatic judge to score or compare answers instead of humans.
      2. The paper organizes everything clearly: how these AI judges work, how to use them better (e.g., ask for pairwise comparisons or vote with several models), and how to fix their common mistakes (like favoring longer answers).
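Reciprocal Rank Fusion itself is simple enough to sketch. This follows the standard RRF formula, score(d) = Σ 1/(k + rank), with the conventional k = 60; the document IDs and retrieval runs are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs with RRF.
    Each document accumulates 1 / (k + rank) from every list it
    appears in, so documents that rank consistently well float up."""
    scores = defaultdict(float)
    for docs in ranked_lists:
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three hypothetical retrieval runs, one per generated sub-query.
runs = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d5", "d1"],
]
print(reciprocal_rank_fusion(runs))  # "d2" ranks first: consistently near the top
```

The constant k damps the influence of the very top ranks, which is why a document that is merely good in every list can beat one that tops a single list.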
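The pairwise-comparison-plus-voting idea from the survey can be sketched with stub judges. In practice each judge would be an LLM call; the `pairwise_vote` helper and the stub judges here are hypothetical.

```python
from collections import Counter

def pairwise_vote(answer_a, answer_b, judges):
    """Ask several judge models which answer is better and take the
    majority verdict. Each judge is a callable (a, b) -> 'A' or 'B'.
    (Randomizing the a/b order per judge would further reduce
    position bias; omitted here for brevity.)"""
    votes = Counter(judge(answer_a, answer_b) for judge in judges)
    return votes.most_common(1)[0][0]

# Stub judges: two prefer the more concise answer, one prefers the longer.
concise = lambda a, b: "A" if len(a) <= len(b) else "B"
verbose = lambda a, b: "A" if len(a) > len(b) else "B"

winner = pairwise_vote("Paris.", "The capital of France is Paris.",
                       [concise, concise, verbose])
print(winner)  # -> "A": the majority favors the concise answer
```

Voting across several judges is one of the survey's suggested mitigations for individual-judge biases, such as favoring longer answers.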
