2025-01-15
For our fundraising process, I did a bit of freestyle writing that some VCs seemed to appreciate. It felt refreshing to attempt to formalize some of my oft-abstract and messy thinking. However, despite the stream-of-consciousness nature of those memos, they were still creatively and structurally burdened by an underlying motivation to instill conviction in capital allocators (many of whom made me realize VC brains are bimodally distributed).
Regardless, I rediscovered the importance of writing to my cognitive health. Even though I spent little of my free time during college writing, and took the bare minimum of humanities credits (turns out a triple major in ECE, Math, and CS + premed requirements won't make you a Renaissance man), it has always held a special place in my heart.
Over the past few months, I've spent a lot of time formulating hypotheses and testing them technically and socially. Building something that is both novel and highly useful is unsurprisingly challenging. Yet I realized I have little documentation of my challenges, my hypotheses, and my ever-changing worldview.
Without that documentation, I miss opportunities to learn effectively from my experiences and to commit those learnings to my decision-making process and intellectual repertoire.
For myself, this writing will primarily be a thought exercise and an attempt to apply structure to my unstructured mind. Although many founders (and seemingly everyone in tech/venture) tout writing as a means of organizing thinking, I'd contend that the true motive is self-marketing: hiring, dealflow, customers. It appears that, in today's world, long-form blogs = knows what they're talking about.
Importantly, my writing will be non-publicized (that means no LinkedIn promotions), hosted only on my website blog, and subject-agnostic. It often will have nothing to do with my work. I believe posting enables a pull mechanism while still avoiding the push dynamics of Substack / LinkedIn / Twitter. It will also be free of any AI assistance (even in researching material). This last fact is important for the preservation of my sanity. In my day-to-day, I already use Claude and Gemini almost gluttonously. This routine will serve as a pseudo-escapist ritual...no SEO optimization, no LinkedIn virality, no intellectual crutches. There is no intended audience and therefore no editing (I'm in markdown without spellcheck).
I do not have a PhD. I have not (yet) built a unicorn or made a grand contribution to the math / science community. I am not a portfolio manager nor a staff-level engineer. I simply have some experience and interest in math, computer architecture, chemistry, quantitative trading, applied AI/ML, and healthcare. This will inform the subject matter and depth of understanding.
note-to-self: work on brevity. Avoid redundancy. Don't repeat yourse--
Obviously, lots of my daily dev work is done in Claude Code now. Experientially, it can feel somewhat unstimulating. I imagine this is what most non-technical work feels like: unrigorous trial and error without the satisfaction that only comes from using one's brain and creativity to solve a puzzle. I'm not entirely sure how to recapture that joy I felt in 350 or 552.
Anyways, I use Claude a lot. One feature I've seen remarkable improvement in is the auto-compacting of my conversations. Running low on context window capacity is an obvious way to degrade model performance. However, in-context learning is learning by conditioning rather than by parameter updates: the model doesn't change its weights; it uses attention to treat the prompt as a temporary dataset and implement an update-like algorithm in the forward pass. Therefore, this compaction is not some compression of our model weights W. It is merely a compression of the temporary dataset, with denoising.
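To make "update-like algorithm in the forward pass" concrete, here's a toy numpy sketch of a construction from the transformers-learn-in-context-by-gradient-descent literature: with suitably chosen projections, one softmax-free linear-attention pass over in-context (x, y) pairs yields exactly the prediction of one gradient step on those pairs, starting from zero weights. Everything here (dimensions, data, learning rate) is illustrative, and none of it is Claude's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eta = 8, 32, 0.1

# The prompt as a temporary dataset: context pairs (x_i, y_i) plus a query
X = rng.normal(size=(n, d))           # context inputs x_i
y = X @ rng.normal(size=d)            # context targets y_i
x_q = rng.normal(size=d)              # query token

# (1) One explicit gradient-descent step on squared loss, from W = 0:
#     W' = (eta/n) * sum_i y_i x_i
W = (eta / n) * (y @ X)
pred_gd = W @ x_q

# (2) One softmax-free linear-attention pass over the prompt:
#     keys = x_i, query = x_q, values = (eta/n) * y_i
scores = X @ x_q                      # key . query for every context pair
pred_attn = ((eta / n) * y) @ scores  # sum_i value_i * score_i

# Same prediction, and the model's weights never changed
print(np.allclose(pred_gd, pred_attn))   # True
```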
Claude's compaction takes the LLM's vast in-context learnings and runs a compression. Mentally, one can compare this compression to the night of sleep in the middle of a two-day task.
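I have no visibility into how Claude actually compacts, but a minimal sketch of the general shape might look like this, with a crude extractive stand-in where a real system would call a summarizer model:

```python
def count_tokens(messages):
    # Crude word-count proxy; a real system would use the model's tokenizer
    return sum(len(m.split()) for m in messages)

def summarize(messages):
    # Placeholder for a model call: keep the first sentence of each turn
    return " ".join(m.split(". ")[0].rstrip(".") + "." for m in messages)

def compact(messages, budget, keep_recent=4):
    """Once over budget, replace the oldest turns with a lossy summary
    while keeping the most recent turns verbatim."""
    if count_tokens(messages) <= budget or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return ["[compacted] " + summarize(old)] + recent
```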
However, because context compression is lossy, we still need to re-contextualize Claude on our codebase. Whether that's specific file structures, code preferences, micro-services, or spec sheets, Claude still "forgets" without proper documentation. Similarly, I believe that written introspection and case studies will be invaluable to preserving a positive beta of intelligence to experience.
I have no neuroscience training, but the practice of writing these blogs almost feels like training an efficient autoencoder for human experience. The input vectors are the physical inputs of my daily life, which undergo some bio-sensory signal processing. This prompts some activation function that develops my raw thoughts, which are then further compressed into human language. This human language serves as the bottleneck layer, from which my autoencoder should be able to recreate thoughts and actions in the physical world.
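To ground the analogy, here's a toy numpy sketch of the bottleneck tradeoff, using the fact that the optimal linear autoencoder is just PCA (all dimensions and data here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 200, 64, 4   # 200 "days", 64-dim experience, 4-dim bottleneck

# Fake "experience" that secretly lives near a low-dimensional manifold
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, d))
X += 0.05 * rng.normal(size=(n, d))        # sensory noise
X -= X.mean(axis=0)                        # center the data

# The optimal linear autoencoder is PCA: encode with the top-k right
# singular vectors of X, decode with their transpose
_, _, Vt = np.linalg.svd(X, full_matrices=False)
encode = lambda x: x @ Vt[:k].T            # bottleneck: the "writing"
decode = lambda z: z @ Vt[:k]              # rereading the notes later

Z = encode(X)                              # (200, 4) short codes
X_hat = decode(Z)                          # lossy reconstruction

# Small but nonzero: compression artifacts are the price of the bottleneck
print(Z.shape, float(np.mean((X - X_hat) ** 2)))
```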
More formally, we can let X be the (super high-dimensional) latent state of a day: inputs, context, outcomes, and the hidden "why" behind what happened. Let Y be the downstream variable I actually care about: the next decision, the updated hypothesis, the technical concept I hope to apply. Writing is me forcing a compression map c: X → Z, where Z is a short text code under a rate constraint (finite words or attention). Rereading later is a decoder d, but realistically it's d: Z → Ŷ more than d: Z → X̂: I'm not trying to reconstruct every pixel of experience; I'm trying to recover whatever actually changes behavior. In information theory, this is an Information Bottleneck problem: keep I(Z;Y) high while paying as little I(X;Z) as possible. If I judge it as "memory," it's rate-distortion: pick a distortion function that defines which errors are acceptable (missing timestamps is fine; missing the causal lever isn't) and accept that compression is inherently lossy. Here, the autoencoder analogy is just the same tradeoff with different symbols: shove everything through a bottleneck and you get generalization and speed, but you also get errors, compression artifacts that perturb the final resolution.
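For reference, the two standard objectives I'm gesturing at, written out (β is the usual tradeoff multiplier; D is the allowed distortion budget):

```latex
% Information Bottleneck: choose the encoder p(z|x) so that Z stays
% predictive of Y while costing as few bits about X as possible
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)

% Rate-distortion: the minimum rate at which the expected distortion,
% under a chosen measure d, stays within the budget D
R(D) \;=\; \min_{p(\hat{x} \mid x) \,:\, \mathbb{E}[d(X,\hat{X})] \le D} I(X;\hat{X})
```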
Regardless, I think the main point here is that, amidst all my rambling, I should practice encoding my experiences into writing, simply because it will better train my intellectual VAE and serve as an external memory unit. More interesting topics to come soon.