Useful Language Modeling

coming soon

currently pursuing ideas about how language models can be useful and valuable, which I currently define by possibly:

  • accomplish better outcomes with less resources
  • make something previously hard/impossible to accomplish possible


hypothesis #1: encoding domain expertise into an input-output interface which scales knowledge distribution orders of magnitude better

  • my current hunch is that a finetuned LLM could be effective at encoding first principles of a domain specific dataset
  • if domains could be boiled down to a set of most important ideas which do not change much over time, finetuning models on these ideas could encode them. (even if the principles change, the model could be finetuned again but unlikely to happen at high frequency)
  • if the dataset is mutually exclusive and collectively exhaustive (MECE) and comprehensive in representing the first principles of a domain - regardless of variation in question, the model might be able to produce accurate useful generations based on reasoning from the first principles encoded
  • this could be valuable because now the experts can encode the intuition required, and scale thier expertise to users that no longer need to go through the same learning curve

work in progress:

domain specific model evaluation for first principles encoding

  • formulating MECE taxonomy for questions evaluating domain first principles

completed:

  • data pipeline:
    • audio diarisation of YouTube podcast episodes using Whisper Large
    • LLM transformation of verbose text structure to QA/ShareGPT format
    • generate preference data from teacher models, rated by gpt3.5T and gpt4
  • finetuned Mistral 7B v0.1 non-instruct using HF SFT Trainer with default params
  • DPO using preferences selected by gpt3.5T and gpt4T with constitution I wrote, taking reference from Anthropic RLAIF method
  • Evals (wip)

Long form post on insights coming soon


(WIP) hypothesis #2: powerful tool for thought: process complex information and generate useful insights at a higher throughput


particularly interested in exploring:

  • Agents powered by multimodal context with domain task capabilities
    • working on a research paper explorer with Agent capabilities

(WIP) hypothesis #3: powerful tool for creativity: identify patterns that are tediuos and not immediately obvious



Gigit AI

gigit.ai

Gigit AI is a platform that helps WhatsApp Businesses scale personalised customer interactions with AI-generated messages.


We learnt that a subset of growing businesses that are operationally heavy drive their sales through WhatsApp, but there were insufficient tools built for them to scale. Their go-to solution was to scale their manpower linearly to their demands, which is why we built Gigit to solve this.


worked extensively on retrieval augmented generation (RAG) applications. Developed our own techniques to maximise the quality and reliability of our product, experimented with various interfaces for serving customers. happy to share what we learnt through the process!


I'll always cherish my time at Gigit. We served great customers and innovated intensely, which stretched me to my best.