Text-to-SQL Course February Update

March 1, 2025 · 2 min read

In this post, I'll talk about three changes made to the implementation of the Text-to-SQL course.

Use pgvector

The first change is switching the local vector store to pgvector. When working with Chroma, I ran into issues whenever I upgraded either Spring Boot or Chroma. Debugging and fixing these kinds of issues is time-consuming, so I changed the vector store to pgvector. pgvector is stable and easy to use, and because it runs inside PostgreSQL, we can also leverage existing database management tools.

Thanks to the VectorStore interface in Spring AI, it's very easy to switch between vector stores. To use pgvector, we simply include the Spring Boot starter for pgvector and change some configuration. No code change is required. The Docker Compose file is also updated to use the pgvector image.
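To illustrate why no code change is needed, here is a minimal sketch in plain Java. The names `SimpleStore`, `InMemoryStore`, and `SchemaSearch` are hypothetical stand-ins and are not Spring AI's actual `VectorStore` API. The toy implementation uses substring matching rather than real vector similarity. The only point is that the caller depends on the interface, so replacing the implementation behind it leaves the caller untouched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class VectorStoreSwapSketch {

    // Hypothetical stand-in for a vector store abstraction; NOT Spring AI's real interface.
    interface SimpleStore {
        void add(String text);
        List<String> search(String query, int topK);
    }

    // Toy implementation: case-insensitive substring match instead of vector similarity.
    static final class InMemoryStore implements SimpleStore {
        private final List<String> texts = new ArrayList<>();

        @Override
        public void add(String text) {
            texts.add(text);
        }

        @Override
        public List<String> search(String query, int topK) {
            String q = query.toLowerCase();
            return texts.stream()
                    .filter(t -> t.toLowerCase().contains(q))
                    .limit(topK)
                    .collect(Collectors.toList());
        }
    }

    // The caller only knows the interface, so any implementation can be injected.
    static final class SchemaSearch {
        private final SimpleStore store;

        SchemaSearch(SimpleStore store) {
            this.store = store;
        }

        List<String> findTables(String keyword) {
            return store.search(keyword, 3);
        }
    }

    public static void main(String[] args) {
        SimpleStore store = new InMemoryStore();
        store.add("Table customers: id, name");
        store.add("Table orders: id, customer_id, total");
        System.out.println(new SchemaSearch(store).findTables("orders"));
    }
}
```

Swapping `InMemoryStore` for another implementation of `SimpleStore` would not require any change to `SchemaSearch`, which mirrors how the course code stays the same when the store behind Spring AI's interface changes.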

Read Environment Variables

The second change is reading environment variables from an env file when deploying to AWS Lambda. Previously, environment variables were loaded from the current machine. If we set an environment variable for the AWS Lambda deployment, it would also be visible to other applications running on that machine. So I changed the deployment to load those environment variables from an env file.
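The idea of resolving variables from a file instead of the machine's environment can be sketched as follows. This is a minimal illustration, not the course's deployment code: the post does not say which tool reads the file, and the `KEY=VALUE` format, the class name `EnvFileSketch`, and the variable names in `main` are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EnvFileSketch {

    // Parses KEY=VALUE lines. Blank lines, lines starting with '#',
    // and lines without a key or '=' are skipped.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> env = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            // Strip one pair of matching surrounding quotes, if present.
            boolean doubleQuoted = value.startsWith("\"") && value.endsWith("\"");
            boolean singleQuoted = value.startsWith("'") && value.endsWith("'");
            if (value.length() >= 2 && (doubleQuoted || singleQuoted)) {
                value = value.substring(1, value.length() - 1);
            }
            env.put(key, value);
        }
        return env;
    }

    public static void main(String[] args) {
        // Hypothetical variable names, used only for illustration.
        List<String> lines = List.of(
                "# deployment settings",
                "DB_URL=jdbc:postgresql://localhost:5432/app",
                "API_KEY=\"secret\"");
        parse(lines).keySet().forEach(System.out::println);
    }
}
```

In practice the lines would come from the env file, for example via `Files.readAllLines(Path.of(...))`, and the resulting map would be passed to the deployment instead of exporting the values in the shell where other applications could see them.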

DeepSeek Integration

The last change is integrating the popular DeepSeek. DeepSeek provides an OpenAI-compatible API, so we can use the OpenAI client to work with it. I added a Spring profile to use DeepSeek. For the model, we can use either deepseek-reasoner for DeepSeek R1 or deepseek-chat for DeepSeek V3. DeepSeek R1 takes longer to respond and doesn't support function calling or JSON output. DeepSeek V3 supports function calling and JSON output, but has some issues.
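To show what "OpenAI compatible" means in practice, here is a rough sketch that builds (but does not send) a chat completion request with the JDK's `java.net.http` API. This is not how the course wires things up, since the course uses the OpenAI client behind a Spring profile. The base URL and the `/chat/completions` path are assumptions based on DeepSeek's public documentation and should be checked before use. The `chooseModel` helper is a hypothetical illustration of the trade-off between the two models.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class DeepSeekRequestSketch {

    // Assumption: base URL as published in DeepSeek's API docs; verify before use.
    static final String BASE_URL = "https://api.deepseek.com";

    // Picks deepseek-chat (V3) when function calling or JSON output is needed,
    // otherwise deepseek-reasoner (R1), which lacks both and responds more slowly.
    static String chooseModel(boolean needsToolsOrJson) {
        return needsToolsOrJson ? "deepseek-chat" : "deepseek-reasoner";
    }

    // Minimal JSON string escaping for backslashes, quotes, and newlines.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Builds an OpenAI-style chat completion request without sending it.
    static HttpRequest build(String apiKey, String model, String prompt) {
        String body = "{\"model\":\"" + model + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escape(prompt) + "\"}]}";
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/chat/completions"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder key; nothing is sent over the network here.
        HttpRequest request = build("YOUR_API_KEY", chooseModel(true), "List all customers");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because the request shape matches OpenAI's chat completion format, an existing OpenAI client typically only needs a different base URL, API key, and model name, which is the kind of setting a Spring profile can hold.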

That's all for the updates. The source code has been updated. You can get a copy of the source code from the fifth lecture.
