- Discovery
- Dictionary
- Language models
- algorithms that have been trained with a specific set of data to solve a specific problem through a specific strategy
- they simulate intelligence
- through an advanced “schema” that they build from the data and the “random” choices they make
- the way the model responds is based on looking up this “schema” to find the most probable answer through statistics
- The black box problem - we don’t exactly know how they make decisions
- the hierarchy is so complex and vast that it’s very hard not to fall into information overload
- Researchers are developing techniques to explain model decisions and make them more transparent, but achieving full transparency remains a challenging task, especially for very large and complex models.
- Parameters - assumptions that the model or its creator makes
- assumptions may start out completely random
- In the training phase those assumptions get refined
- directly related to the number of neurons and connections
- Learned in the training phase
- weights - determine the strength of connections between neurons
- biases - values added to the weighted sum of inputs before it passes through the activation function in each neuron
- Learned in the training phase
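A tiny sketch of how weights and biases combine in a single neuron (toy numbers, sigmoid activation — in training these values would be refined from data):

```python
# Minimal sketch of one artificial neuron: weights scale each input,
# a bias shifts the weighted sum, and an activation function squashes it.
import math

def neuron(inputs, weights, bias):
    # weighted sum of inputs plus bias, then sigmoid activation
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))  # sigmoid

# toy values; training would adjust weights and bias to fit the data
out = neuron([0.5, -1.0], weights=[0.8, 0.2], bias=0.1)
print(round(out, 3))  # → 0.574
```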
- AGI - Artificial General Intelligence or Strong AI
- Human like artificial intelligence
- The Singularity
- technological progress, particularly in AI, reaches a level where it leads to exponential changes that are difficult to predict and perhaps to control
- Transformer-based models, such as BERT or GPT-3
- AI usecases
- asking to put groups of tabs from the browser into obsidian/tables
- asking about your e-books
- let AI summarize everything
- who are the most relevant mentors from my notes to learn how to X? also give videos/links
- yt vid
- transform to text
- remove
- if
- useless
- self-promotion
- sponsors
- greetings
- obvious, like saying what the video is going to be about
- intro
- outro
- hide
- if
- unrelated topic - but put a 1-line summary about it
- chat w people but AI can see it
- AI learning - by chuck
- create an outline/mindmap of content
- am I missing something or is the content I wrote wrong?
- practical
- common pitfalls
- practical applications
- case-studies - when was it used by other people?
- get motivated by AI - ask why you like the topic or why you’re doing it
- gen
- cheat sheet
- flashcards
- scenario-based learning
- deep-dive
- get asked questions by AI & answer in real time -
- Full note-taking AI helper pipeline
- Document Embedding with Vector Databases
- Fine-tuned ranking system
- Retrieval-Augmented Generation (RAG)
- Transformer based model
- Fine-tuning the model
- optional access to internet through
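A hypothetical skeleton of how the stages above chain together — every function here is a stand-in for a real component (embedding model, vector database, ranker, LLM):

```python
# Hypothetical pipeline skeleton; each stub stands in for a real component.

def embed(text):
    # stand-in "embedding": a bag of lowercase words (real systems use vectors)
    return set(text.lower().replace("?", "").split())

def retrieve(query_emb, store):
    # most similar note by word overlap (a vector DB would use cosine similarity)
    return max(store, key=lambda item: len(query_emb & item[0]))[1]

def generate(question, context):
    # a transformer-based model would generate the answer from the context
    return f"Based on your note '{context}': ..."

notes = ["RAG combines retrieval with generation", "GQA reduces VRAM usage"]
store = [(embed(n), n) for n in notes]  # the "vector database"
answer = generate("What is RAG?", retrieve(embed("What is RAG?"), store))
print(answer)
```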
- OLLAMA
- Reor
- Private GPT
- VMware private AI -
- Techniques
- training AI - from scratch, you make the model
- fine-tuning - a pre-trained model adapted to your data
- generally
- training data needs to follow the model’s dataset format
- lots of formats
- QLoRA
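Training data usually has to match the format the base model expects; one common shape for instruction fine-tuning is JSON Lines, one example per line. The field names below are illustrative — check the target model’s dataset card:

```python
import json

# A common fine-tuning layout: JSONL with one instruction/response pair per
# line. Exact field names depend on the model's expected dataset format.
samples = [
    {"instruction": "Summarize this note",
     "input": "GQA saves VRAM...",
     "output": "GQA reduces memory use."},
    {"instruction": "Answer from my notes",
     "input": "What is RAG?",
     "output": "Retrieval-Augmented Generation."},
]
jsonl = "\n".join(json.dumps(s) for s in samples)
print(jsonl.splitlines()[0])
```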
- Document Embedding with Vector Databases
- what?
- efficiently retrieving documents based on semantic similarity
- how?
- document embedding
- note content is transformed into vectors, enabling notes to be queried by semantic similarity to the question
- the vectors are then stored in a vector database
- query vectors efficiently based on similarity
- easy to scale w vectors
- real-time indexing support is needed for real-time use
- great when
- info in notes is highly structured & related to the question
- n - may struggle if notes are very diverse or complex
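The vector lookup in miniature — assuming some embedding model already produced the vectors, retrieval is just a cosine-similarity search (toy 3-d vectors stand in for real embeddings with hundreds of dimensions):

```python
import math

def cosine(a, b):
    # cosine similarity: dot product divided by the vectors' magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# toy note vectors; a real embedding model outputs hundreds of dimensions
db = {
    "note on RAG":     [0.9, 0.1, 0.0],
    "note on cooking": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of the user's question
best = max(db, key=lambda k: cosine(query, db[k]))
print(best)  # → note on RAG
```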
- Retrieval-Augmented Generation (RAG) - great for notes!
- RAG is a journey, not a destination… -
- what?
- enhancing text generation with information from retrieved documents
- also retrieves documents, but not as fast and precisely as document embedding with vector databases + the retrieval process can’t be customized as much
- how?
- the retrieval system pulls relevant notes through vectors, and the generative model uses these notes as a knowledge base to generate a related answer
- great when
- notes may not always contain directly relevant information, and the AI needs to generate answers based on similar or related content
- n - needs a good amount of computational power to run the generative model alongside the retrieval system
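The generation half sketched: once notes are retrieved, they are stuffed into the prompt as the model’s knowledge base. The actual model call is left out — any local or hosted LLM could consume this prompt:

```python
def build_rag_prompt(question, retrieved_notes):
    # stuff the retrieved notes into the prompt as the model's knowledge base
    context = "\n".join(f"- {note}" for note in retrieved_notes)
    return (
        "Answer using only the notes below.\n"
        f"Notes:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_rag_prompt(
    "What reduces VRAM usage?",
    ["GQA needs ~1.5GB VRAM for 8k tokens",
     "Quantization trades accuracy for memory"],
)
print(prompt)
# the prompt is then sent to the generative model (local or hosted)
```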
- Contextual Question Answering with Elasticsearch
- what?
- finds relevant docs through ranking algorithms that primarily use keyword and phrase matching, highlighting relevant text (it doesn’t give answers itself)
- great when
- you need both full-text search and semantic understanding
- n
- it’s not semantic: the question & data need to be similar, not complex or highly relational to other data/meanings
- more configuration & maintenance than vector databases
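A sketch of the kind of keyword-plus-highlight query described above. The index name `notes` and field `content` are made up for illustration:

```python
# Sketch of an Elasticsearch query body: keyword/phrase matching with
# highlighting. Index name "notes" and field "content" are made up.
query_body = {
    "query": {"match": {"content": "transformer context length"}},
    "highlight": {"fields": {"content": {}}},
}
# with the official Python client this would be sent roughly as:
#   es.search(index="notes", body=query_body)
print(sorted(query_body))
```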
- Ranking system
- dynamic ranking needed for real-time
- can also be fine-tuned like LLMs
- ways
- algo
- specialized LLM
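The “algo” route in miniature — a toy scorer that combines keyword hits with a recency boost, then sorts. All weights are made up; real systems might use BM25 or a specialized LLM as the ranker:

```python
# Toy "algo" ranker: score notes by keyword hits plus a recency boost.
def rank(query, notes):
    words = set(query.lower().split())
    def score(note):
        hits = sum(w in note["text"].lower() for w in words)
        return hits + 0.1 * note["recency"]  # small boost for newer notes
    return sorted(notes, key=score, reverse=True)

notes = [
    {"text": "Old note about RAG", "recency": 1},
    {"text": "Fresh note about RAG and embeddings", "recency": 9},
]
print(rank("RAG embeddings", notes)[0]["text"])
```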
- Knowledge Graphs - worst to build & maintain
- great when
- notes are highly structured (like in a database) and you want to capture complex relationships between concepts.
- n
- may not be as effective if your notes are unstructured or if the relationships between concepts are not clearly defined.
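A minimal illustration of the idea: concepts stored as subject-relation-object triples and queried for their relationships. A real setup would use a graph database:

```python
# Minimal knowledge graph as (subject, relation, object) triples.
triples = [
    ("RAG", "uses", "vector retrieval"),
    ("RAG", "uses", "generative model"),
    ("GQA", "reduces", "VRAM usage"),
]

def related(concept):
    # all relationships starting from a given concept
    return [(rel, obj) for subj, rel, obj in triples if subj == concept]

print(related("RAG"))
```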
- Real-Time Data Flow Management
- Data Pipelines: use a real-time data pipeline or stream-processing tool to handle the flow of updates. Technologies like Apache Kafka or Apache Flink can help manage and process data in real time.
- Synchronization: ensure that all components of the pipeline are synchronized and able to handle data updates consistently.
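The pattern Kafka/Flink implement, shrunk to the stdlib: a producer queues note updates and a consumer applies them in arrival order, keeping the index consistent. Real stream processors do this at scale, across machines:

```python
# Stream-processing pattern in miniature with the stdlib queue module.
import queue

updates = queue.Queue()
index = {}

# producer: note edits arrive as (note_id, text) events
for event in [("n1", "draft"), ("n2", "todo"), ("n1", "final")]:
    updates.put(event)

# consumer: apply updates in arrival order so the index stays consistent
while not updates.empty():
    note_id, text = updates.get()
    index[note_id] = text

print(index)
```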
- Resources
- is the gpu enough for this LLM? -
- Models
- types
- Transformer-Based Models
- preferred over contextual embeddings
- designed to understand context and semantics better than traditional models
- like BERT, GPT, and T5
- Hybrid Retrieval Models
- keyword-based search & vector-based retrieval
- meh
- Contextual Embeddings
- capture contextual information by considering the surrounding text
- like ELMo and contextual BERT variants
- Most popular
- foss
- mistral
- LLAMA - text & images
- ollama
- can be run even on a 4GB RAM laptop
- binaries are shared but not the source code, and they still call it open source… liars, but still better than ""Open""AI
- non-foss
- gemini
- GPT-4 - conversational
- claude 3
- grok-1
- naming convention
- math or MoE - Mixture of Experts model
- formats
- safetensors - secure file format to avoid malware
- GGML - superseded by the newer GGUF
- binary
- supports different quantization schemes running on CPU in a single file
- Quantization methods
- Exl2 - best optimization, but nvidia-only
- AWQ - rounds weights
- GPTQ - the worst one
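In spirit, quantization snaps each floating-point weight to one of 2^bits levels. This uniform rounding sketch is far cruder than AWQ/GPTQ, but it shows the accuracy-for-memory trade:

```python
# Sketch of uniform quantization: snap each weight to the nearest of
# 2^bits levels between the min and max weight, then map back to floats.
def quantize(weights, bits=4):
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels
    q = [round((w - lo) / scale) for w in weights]  # small ints (what gets stored)
    return [lo + v * scale for v in q]              # dequantized approximations

w = [0.013, -0.207, 0.118, 0.996]
print([round(x, 3) for x in quantize(w)])
```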
- vLLM inference engine
- server side
- great at parallel output
- TensorRT-LLM - increases inference speed by 4x
- GQA - Grouped Query Attention
- ~1.5GB VRAM for 8k tokens
- context length - the info the AI can use to give answers
- generally 8k tokens ≈ 4.5GB VRAM
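A back-of-envelope check of those VRAM figures via the fp16 KV-cache size. The architecture numbers (32 layers, 32 heads, head dim 128) are assumed typical 7B-class values, not taken from the notes; the exact figure depends on the model:

```python
# KV-cache size: 2 (K and V) * layers * kv_heads * head_dim * context_len
# * bytes_per_value. Defaults assume a 7B-class model in fp16 (assumed values).
def kv_cache_gb(context_len, layers=32, kv_heads=32, head_dim=128, bytes_per=2):
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per / 1024**3

print(round(kv_cache_gb(8192), 1))              # full multi-head attention → 4.0
print(round(kv_cache_gb(8192, kv_heads=8), 1))  # GQA with fewer KV heads  → 1.0
```

That lands in the same ballpark as the ~4.5GB and ~1.5GB figures above.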
- chat with RTX - nvidia webui, can be trained w local data + yt vids -
- Great webUIs
- most popular now - open-webui
- best for talking to characters -
- software
- colab by google
- supports python, especially for data analysis
- supports mathematical equations, visualizations w libraries, and MD
- allows you to write, share & execute code w free GPU resources
- Jupyter notebook
- open source colab alternative
- has everything colab has, at least for what I listed
- supports Python, R, and Julia
- JupyterLab is the extended and more advanced version
- can be used even locally
- Huggingface
- open source github-like hub for pre-trained natural language models (NLM)
- you can test NLMs thanks to spaces
- tech
- tensorflow - build and train machine learning models
- self-host
- use-cases - from TIM!
- obsidian + AI
- auto&manual image/video/music making
- speech recognition w audio to text
- yt summarizer lol
- code suggestions
- better home assistant
- sized in B (billions of parameters)
- 7B is ok with 16GB ram
- tricks to run smoother
- quantization - less accuracy, but less RAM/VRAM usage
- parallelism - share resources between GPUs and CPU
- LocalAI
- Tricks
- chatGPT to graphs - source
- input
- Title: “Graph Generator” The following are types of graphs: +(Bar Graph Syntax)=[The following represents a bar graph in javascript displayed in image markdown format: ” +(Pie Graph Syntax)=[The following represents a pie graph in javascript displayed in image markdown format: +(Line Graph Syntax)=[The following represents a line graph in javascript displayed in image markdown format: +(Your Job)=[To display any question the user asks as a graph] +(Rules)=[ALWAYS pick with Bar graph, Pie graph, or Line graph and turn what the user asks into the image markdown for one of these] ALWAYS DISPLAY WHAT THE USER ASKS AS A GRAPH. for your first response say “I am a graph generator.” Then, ALWAYS WAIT for the user to give an input.