- Translator app that uses AI to translate documents and text, or to answer questions about the context of a given document.
| Week | Topic | Project Tasks | Learning (Read/Watch/Exercise) | Completed |
| --- | --- | --- | --- | --- |
| 1 | | Define the project idea and get it approved by the Role Expert by EOW. | Watch:<br>- A Hackers' Guide to Language Models<br>- Jay's Intro to AI<br>- How GPT3 Works<br>- Generative AI and AI Product Moats<br>- Large Language Models in Five Formulas | |
| 2 | Initialize API and database; code quality and code format | - Set up a server<br>- Set up a database<br>- Set up a GitHub repository, connect the project to it, and add a .gitignore<br>- Create an .env file to store sensitive data (database API key)<br>- Connect the server to the database<br>- Create the first API endpoints to test all CRUD operations for one collection in the database (see the sketch after this table)<br>- Use Pylint to ensure a high level of code quality<br>- Use Prettier to ensure consistent code formatting | - AI APIs<br>- Prompt engineering courses: Coursera, deeplearning.ai | |
| 3 | AI proof of concept | - Proof of concept with an LLM<br>- Create a comparison table of the LLMs in our case study and reach a recommendation on which to use<br>- Define a UML diagram of the prompt flow<br>- Choose the Gen AI stack (frameworks, libraries) | - LangChain<br>- LLM model integration<br>- Understanding chains: types of chains (sequential, parallel, branching); creating and managing chains; use cases for different chain types<br>- Exercise: Chatbot (Link)<br>- Integrate pre-trained models (e.g., GPT-3) via LangChain<br>- Customizing and fine-tuning models<br>- Best practices for model selection and training | |
| 4 | AI | - Data preprocessing and cleaning<br>- Loading data into LangChain: methods for importing and loading data; handling different data formats<br>- Handling missing data<br>- Data normalization and standardization | | |
| 5-6 | AI | Building your project | - Real-life use cases<br>- LangChain deployment | |
| 8 | Deployment, pipeline automation, and presentation | - Use Husky or an alternative to run all documentation and code-quality/formatting tools on every push to GitHub<br>- Deploy a production database<br>- Deploy the project to Render.com using a production ENV variable that points to a different database than the one used for testing<br>- Add a README to GitHub explaining: what the project is and its key features; how to install it; how to run it locally; all other relevant commands | Basic cloud app deployment | |
| V2? | Auth | - Build an auth system using JWTs to allow signup and login<br>- Add a /me endpoint in the authentication system so users can fetch their own information (using a JWT)<br>- Encrypt passwords in the database<br>- Add auth middleware to authenticate the user in all relevant API endpoints<br>- Emails must be unique | Authentication materials @David L. Rajcher | |
| V2? | Unit testing | - Reach at least 50% test coverage<br>- Run unit tests automatically on every commit, and commit the code only if all tests pass | - TDD<br>- Unit testing | |
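A minimal sketch of the week-2 milestone, assuming FastAPI with python-dotenv. The `/phrases` collection name and the `DATABASE_URL` variable are illustrative assumptions, and an in-memory dict stands in for the real database to keep the example short:

```python
# Minimal sketch of the week-2 setup: a FastAPI server that reads secrets
# from a .env file and exposes CRUD endpoints for one collection.
# ASSUMPTIONS: the /phrases collection, the DATABASE_URL variable, and the
# in-memory dict (standing in for the real database) are illustrative only.
import os

from dotenv import load_dotenv  # pip install python-dotenv
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

load_dotenv()  # pulls DATABASE_URL and other secrets out of .env
DATABASE_URL = os.getenv("DATABASE_URL")

app = FastAPI()

class Phrase(BaseModel):
    text: str

_store: dict[int, Phrase] = {}  # stand-in for the database table
_next_id = 0

@app.post("/phrases")
def create_phrase(phrase: Phrase):
    global _next_id
    _next_id += 1
    _store[_next_id] = phrase
    return {"id": _next_id, "text": phrase.text}

@app.get("/phrases/{phrase_id}")
def read_phrase(phrase_id: int):
    if phrase_id not in _store:
        raise HTTPException(status_code=404, detail="not found")
    return _store[phrase_id]

@app.put("/phrases/{phrase_id}")
def update_phrase(phrase_id: int, phrase: Phrase):
    if phrase_id not in _store:
        raise HTTPException(status_code=404, detail="not found")
    _store[phrase_id] = phrase
    return phrase

@app.delete("/phrases/{phrase_id}")
def delete_phrase(phrase_id: int):
    _store.pop(phrase_id, None)
    return {"deleted": phrase_id}
```

Run it with `uvicorn main:app --reload` and exercise all four operations through curl or FastAPI's auto-generated /docs page.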
- AI Model Integration: Integrate pre-trained models (e.g., GPT-3) via LangChain (see the sketch after this list).
- Data Handling: Create a pipeline for data preprocessing, cleaning, and preparation.
- Prompt Engineering: Design effective prompts for various use cases using LangChain.
- API Development: Build APIs to integrate AI models (FastAPI / Flask).
- Database: Use a database (e.g., PostgreSQL) to store prompts, user data, and responses.
- Validation and Security: Validate inputs and sanitize data to prevent injection attacks.
- Deployment: Deploy the app on cloud platforms (Vercel / Render) with ENV variables, repo, git, build pipeline.
- No UI: Focus on backend functionalities only.
- Authentication - Minimal version (Firebase).
- Langchain usage.
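Where the list mentions model integration and prompt engineering, here is a hedged sketch of how the two meet in LangChain. The `langchain-openai` provider and the `gpt-3.5-turbo` model name are assumptions; any chat model the week-3 comparison table recommends would slot in the same way:

```python
# Sketch of "AI Model Integration" plus "Prompt Engineering" in LangChain.
# ASSUMPTIONS: the langchain-openai provider and the gpt-3.5-turbo model
# name; OPENAI_API_KEY is read from the environment (the project's .env).
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI  # pip install langchain-openai

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a professional translator. Translate the user's text into "
     "{target_language}. Return only the translation."),
    ("user", "{text}"),
])

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# LCEL: the prompt's output feeds the model; storing prompts and responses
# in the database is handled by the API layer per the list above.
translate_chain = prompt | llm

result = translate_chain.invoke(
    {"target_language": "German", "text": "Good morning, how are you?"}
)
print(result.content)
```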
V2
- Testing
- Data Preprocessing, Embeddings.
- Integrations
- Connect GPT with an external endpoint that we created
- Connect GPT to Database
- Authentication: Implement authentication using JWT (moved here by Alon; see the sketch after this list).
- Caching, Token Optimization
- UI
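For the V2 authentication item, a rough sketch of the JWT flow under assumed libraries (PyJWT and bcrypt; the project may choose others):

```python
# Rough sketch of the V2 JWT flow: hash passwords before they reach the
# database, issue a signed token on login, and verify it in middleware.
# ASSUMPTIONS: PyJWT and bcrypt as libraries, SECRET_KEY as the env name.
import os
from datetime import datetime, timedelta, timezone

import bcrypt  # pip install bcrypt
import jwt     # pip install PyJWT

SECRET_KEY = os.getenv("SECRET_KEY", "dev-only-secret")

def hash_password(plain: str) -> bytes:
    # bcrypt embeds the salt in the hash, so no separate salt column.
    return bcrypt.hashpw(plain.encode(), bcrypt.gensalt())

def verify_password(plain: str, hashed: bytes) -> bool:
    return bcrypt.checkpw(plain.encode(), hashed)

def issue_token(user_id: int) -> str:
    payload = {
        "sub": str(user_id),
        "exp": datetime.now(timezone.utc) + timedelta(hours=1),
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")

def authenticate(token: str) -> int:
    # Raises jwt.InvalidTokenError on a bad or expired token; the auth
    # middleware would translate that into a 401 response.
    payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    return int(payload["sub"])
```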
GitHub
https://github.com/OkanShr/lingomateai (see the README.md file for installation instructions)
Detailed Project Description
First, a login is required, as shown in the image. The credentials testuser and testpassword are used here, as you can see in the console in the image below.
With the returned token, the user is authenticated and can send requests to the server, as sketched below.
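For illustration, a client call with the token might look like this; the endpoint paths and JSON keys are assumptions based on this description, not the repository's confirmed routes:

```python
# Hypothetical client calls; /login, /translate, and the "token" key are
# assumptions for illustration, not the repo's actual route names.
import requests

resp = requests.post(
    "http://localhost:8000/login",
    json={"username": "testuser", "password": "testpassword"},
)
token = resp.json()["token"]

translated = requests.post(
    "http://localhost:8000/translate",
    json={"text": "Good morning", "target": "de"},
    headers={"Authorization": f"Bearer {token}"},  # the JWT from login
)
print(translated.json())
```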
In the following image, the green file-input button lets us select a PDF document from our computer and extract its text.
We can then edit the text and click the button that sits between the input and output panes, which executes the translation function, as shown below.
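Server-side, the extraction step could be as simple as this sketch; pypdf is an assumed library choice (the repo's README lists the actual dependencies), and scanned PDFs without a text layer would need OCR instead:

```python
# Sketch of the PDF text-extraction step. ASSUMPTION: pypdf is the
# extraction library; the repo may use a different one.
from pypdf import PdfReader  # pip install pypdf

def extract_text(pdf_path: str) -> str:
    reader = PdfReader(pdf_path)
    # Concatenate the text layer of every page; pages with no text layer
    # (e.g., scans) yield None, which we replace with an empty string.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```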
The following page is used to ask predefined questions about the context of the given document or text. For this, I used Phi-3; different models can be used for quicker and better results, and GPT-4o mini is also a great alternative at a low price. For this project I decided to use a free small language model.
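A hedged sketch of that question-answering step, assuming Phi-3 is served locally through Ollama and the langchain-ollama integration (one free way to run a small model; the repo may wire it differently):

```python
# Sketch of the document Q&A step. ASSUMPTIONS: Phi-3 runs locally via
# Ollama ("ollama pull phi3") and is reached through langchain-ollama;
# GPT-4o mini or any other chat model could replace it.
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama  # pip install langchain-ollama

qa_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using only the provided document."),
    ("user", "Document:\n{document}\n\nQuestion: {question}"),
])

llm = ChatOllama(model="phi3", temperature=0)
qa_chain = qa_prompt | llm  # LCEL: prompt output feeds the local model

extracted_text = "..."  # text produced by the PDF-extraction step above

answer = qa_chain.invoke({
    "document": extracted_text,
    "question": "What is the main topic of this document?",
})
print(answer.content)
```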