GenAI Engineer Project - Okan


This track equips learners to proficiently create, integrate, test, and deploy AI-driven applications and prompt engineering solutions, leveraging modern AI models and real-life cloud environments. This is a back-end oriented project.
Project Definition
  • A translator app that uses AI to translate documents and text, or to answer questions about the content of a given document.
Project Links
  • Add any links to resources about your project
Project Resources
  • Add any resources about the project
Resources
MVP
Timeline
Week | Topic | Project Tasks | Learning (Read/Watch/Exercise) | Completed
Week 1
  Project Tasks:
    • Define project idea and get it approved by Role Expert by EOW.
  Learning:
    • Watch:
      • A Hackers' Guide to Language Models
      • Jay's Intro to AI
      • How GPT3 Works
      • Generative AI and AI Product Moats
      • Large Language Models in Five Formulas
Week 2
  Topic: Initialize API and database; code quality and code formatting
  Project Tasks:
    • Set up a server.
    • Set up a database.
    • Set up a GitHub repository, connect the project to it, and add a .gitignore file.
    • Create an ENV file to store sensitive data (database API key).
    • Connect the server to the database.
    • Create the first API endpoints to test all CRUD operations for one collection in the database.
    • Use PyLint to ensure a high level of code quality.
    • Use Prettier to keep code formatting consistent.
  Learning:
    • AI APIs
    • Prompt Engineering Courses - Coursera - deeplearning.ai
Week 3
  Topic: AI proof of concept
  Project Tasks:
    • Build a proof of concept with an LLM.
    • Create a comparison table of the LLMs in our case study and recommend which one to use.
    • Define a UML/diagram of the flow of prompts.
    • Choose the GenAI stack (frameworks, libraries).
  Learning:
    • LangChain
    • LLM Models Integration
    • Understanding Chains:
      • Types of chains (sequential, parallel, branching).
      • Creating and managing chains.
      • Use cases for different chain types.
    • Exercise - Chatbot (Link)
    • Integrate pre-trained models (e.g., GPT-3) via LangChain.
      • Customizing and fine-tuning models.
      • Best practices for model selection and training.
Week 4
  Topic: AI
  Learning:
    • Data Preprocessing and Cleaning
      • Loading Data into LangChain: methods for importing and loading data; handling different data formats.
      • Handling Missing Data
      • Data Normalization and Standardization
Weeks 5-6
  Topic: AI
  Project Tasks:
    • Building Your Project
  Learning:
    • Real life use cases
    • LangChain deployment
Week 8
  Topic: Deployment, pipeline automation, and presentation
  Project Tasks:
    • Use Husky or an alternative to run all documentation, code quality, and formatting tools on every push to GitHub.
    • Deploy a production database.
    • Deploy your project to Render.com using a production ENV variable that points to a separate database, not the one you used for testing.
    • Add a README to GitHub explaining:
      • what the project is and what its key features are
      • how to install it
      • how to run the project locally
      • all other relevant commands
  Learning:
    • Basic cloud app deployment
V2?
  Topic: Auth
  Project Tasks:
    • Build an auth system using JWTs to allow signup and login.
    • Add a /me endpoint to the authentication system so that users can fetch their own information (using a JWT).
    • Hash passwords before storing them in the database.
    • Add auth middleware to authenticate the user in all relevant API endpoints.
    • Email addresses must be unique.
  Learning:
    • Authentication materials @David L. Rajcher
V2?
  Topic: Unit Testing
  Project Tasks:
    • Reach at least 50% test coverage.
    • Run unit tests automatically on every commit, and allow the commit only if all tests pass.
  Learning:
    • TDD
    • Unit testing
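
The Week 4 learning topics in the timeline above (handling missing data, normalization, standardization) can be illustrated with a small standard-library sketch. The function names and the choice to fill gaps with the mean are assumptions made for illustration; they are not taken from the project code.

```python
from statistics import mean, pstdev


def fill_missing(values, fill=None):
    """Replace None entries with `fill`, defaulting to the mean of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("no known values to impute from")
    if fill is None:
        fill = mean(known)
    return [fill if v is None else v for v in values]


def normalize(values):
    """Min-max normalization: rescale values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Z-score standardization: mean 0 and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


raw = [2.0, None, 4.0, 6.0]      # one missing measurement
filled = fill_missing(raw)        # the gap is replaced by the mean of 2, 4, 6
scaled = normalize(filled)
zscores = standardize(filled)
```

Normalization suits features that need a fixed range, while standardization is the usual choice when values should be comparable in units of standard deviation.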
MVP requirements
  • AI Model Integration: Integrate pre-trained models (e.g., GPT-3) via LangChain.
  • Data Handling: Create a pipeline for data preprocessing, cleaning, and preparation.
  • Prompt Engineering: Design effective prompts for various use cases using LangChain.
  • API Development: Build APIs to integrate AI models (FastAPI / Flask).
  • Database: Use a database (e.g., PostgreSQL) to store prompts, user data, and responses.
  • Validation and Security: Validate inputs and sanitize data to prevent injection attacks.
  • Deployment: Deploy the app on cloud platforms (Vercel / Render) with ENV variables, repo, git, build pipeline.
  • No UI: Focus on backend functionalities only.
  • Authentication - Minimal version (Firebase).
  • LangChain usage.
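
As a rough illustration of the "Database" and "Validation and Security" requirements above, the sketch below validates a translation request and stores it with a parameterized query. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL; the table layout, the length limit, and the set of language codes are all assumptions, not the project's actual schema.

```python
import sqlite3

MAX_TEXT_LENGTH = 5000                     # assumed limit, tune for the chosen model
SUPPORTED_LANGUAGES = {"en", "de", "tr"}   # assumed set of language codes


def validate_request(text, target_language):
    """Return the cleaned text, or raise ValueError for invalid input."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError("text is too long")
    if target_language not in SUPPORTED_LANGUAGES:
        raise ValueError("unsupported target language")
    # Drop non-printable control characters but keep newlines and tabs.
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t").strip()


def save_translation(conn, user_id, prompt, response):
    """Store a prompt/response pair using a parameterized query."""
    # The ? placeholders keep user text out of the SQL string itself,
    # which is what prevents SQL injection.
    cur = conn.execute(
        "INSERT INTO translations (user_id, prompt, response) VALUES (?, ?, ?)",
        (user_id, prompt, response),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translations ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, prompt TEXT, response TEXT)"
)

hostile = "Hello'); DROP TABLE translations;--"
clean = validate_request(hostile, "de")
row_id = save_translation(conn, 1, clean, "Hallo")
stored = conn.execute(
    "SELECT prompt FROM translations WHERE id = ?", (row_id,)
).fetchone()[0]
```

The hostile string ends up stored as ordinary text and the table survives, because the driver never interprets it as SQL. With a PostgreSQL driver the same idea applies, although the placeholder syntax differs.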

V2

V2 optional requirements
  • Testing
  • Data Preprocessing, Embeddings.
  • Integrations
    • Connect GPT with external end-point that we created
    • Connect GPT to Database
  • Authentication: Implement authentication using JWT. (Moved here by Alon)
  • Caching, Token Optimization
  • UI
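
The caching item in the list above could look roughly like the following: identical requests are answered from memory instead of calling the model again. This is a minimal sketch; `ResponseCache`, `fake_model`, and the model name string are hypothetical, and a deployed version would more likely use an external store with expiry.

```python
import hashlib


class ResponseCache:
    """In-memory cache keyed by a hash of (model, prompt)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, prompt):
        # Collapse whitespace so trivially different prompts share one entry.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()

    def get_or_call(self, model, prompt, call_model):
        """Return the cached response, calling `call_model` only on a miss."""
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call_model(prompt)
        self._store[key] = result
        return result


def fake_model(prompt):
    """Stand-in for a real LLM call; returns the prompt upper-cased."""
    return prompt.upper()


cache = ResponseCache()
first = cache.get_or_call("phi-3", "translate:  hello", fake_model)
second = cache.get_or_call("phi-3", "translate: hello", fake_model)
```

The second call differs only in whitespace, so it is served from the cache and the model function runs once.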

GitHub

https://github.com/OkanShr/lingomateai (see the README.md file for installation instructions)


Detailed Project Description

First, the user must log in, as shown in the image. The credentials testuser and testpassword are used here, as you can see in the console output in the image below.

image
image

With the token received after login, the user is authenticated and can send requests to the server.
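
To make the token step concrete, here is a from-scratch sketch of how an HS256-signed JWT can be issued and checked using only the standard library. It is an illustration of the mechanism, not the project's actual auth code: the secret, the claim names, and the function names are assumptions, and a real service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # assumption: in practice, load this from an ENV variable


def _b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def _sign(header, payload):
    signing_input = f"{header}.{payload}".encode("utf-8")
    return _b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())


def create_token(username, ttl_seconds=3600, now=None):
    """Build an HS256-signed token carrying the username and an expiry time."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    claims = {"sub": username, "exp": now + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    return f"{header}.{payload}.{_sign(header, payload)}"


def verify_token(token, now=None):
    """Return the claims if the signature is valid and the token has not expired."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    # Always verify with HS256; the algorithm named in the header is ignored.
    expected = _sign(header, payload)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] < now:
        raise ValueError("token expired")
    return claims


token = create_token("testuser", now=1_000)
claims = verify_token(token, now=1_500)
```

A client would send such a token with each request, and the server-side middleware would call something like `verify_token` before running the endpoint.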

As the following image shows, the green file input button lets us select a PDF document from our computer and extract its text.

We can then edit the text and click the button that divides the input and output, which runs the translation function as shown below.

image

The following page is used to ask predefined questions about the content of the given document or text. For this I used Phi-3, because I decided to use a free small language model for this project. Other models can be used for quicker and better results; GPT-4o mini is also a good low-cost alternative.

image
image
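
The question page described above can be pictured as a short sequential pipeline: build a prompt from a predefined question and the document text, call the model, then tidy the answer. The sketch below is written in plain Python and does not use the LangChain API. The question set, the character budget, and `echo_model` (a stub standing in for a real Phi-3 call) are assumptions for illustration.

```python
PREDEFINED_QUESTIONS = {  # hypothetical question set
    "summary": "Summarize the document in two sentences.",
    "topic": "What is the main topic of the document?",
}

MAX_CONTEXT_CHARS = 2000  # assumed budget; small models have limited context


def build_prompt(question_key, document_text):
    """Step 1: combine a predefined question with the (truncated) document text."""
    question = PREDEFINED_QUESTIONS[question_key]
    context = document_text[:MAX_CONTEXT_CHARS]
    return (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def clean_answer(raw_answer):
    """Step 3: trim surrounding whitespace from the model output."""
    return raw_answer.strip()


def answer_question(question_key, document_text, call_model):
    """Run the steps in sequence: prompt -> model -> post-processing."""
    prompt = build_prompt(question_key, document_text)
    return clean_answer(call_model(prompt))


def echo_model(prompt):
    """Stub model: echoes the question line instead of generating an answer."""
    question_line = [ln for ln in prompt.splitlines() if ln.startswith("Question:")][0]
    return f"  (stub) {question_line}  "


result = answer_question("topic", "This contract covers translation services.", echo_model)
```

Because the model is passed in as a function, swapping Phi-3 for another model only changes the `call_model` argument.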