Raveesh's Notes

All my highlights and notes from books, articles, blogs and Tweets. Share your thoughts with me on Twitter or Mastodon.

You can subscribe to the contents in your favorite RSS Feed reader here. I recommend Readwise's Reader, Feedly and Reeder.

The Current Climate in AI Has So Many Parallels To...

2 highlights/notes in François Chollet's post The Current Climate in AI Has So Many Parallels To...

The bull case is that generative AI becomes a widespread UX paradigm for interacting with most tech products (note: this has nothing to do with AGI, which is a pipe dream). Near-future iterations of current AI models become our interface to the world's information.


The bear case is the continuation of the GPT-3 trajectory, which is that LLMs only find limited commercial success in SEO, marketing, and copywriting niches, while image generation (much more successful) peaks as a XB/y industry circa 2024. LLMs will have been a complete bubble.


The Single Most Undervalued Fact of Linear Algebra: Matrices Are...

1 highlight/note in Tivadar Danka's post The Single Most Undervalued Fact of Linear Algebra: Matrices Are...

To sum it all up, this was the haiku I wrote when I first discovered the connection between graphs and matrices:

"To study structure,
tear away the flesh, until
only the bone shows."


Transformers From Scratch

8 highlights/notes in Brandon Rohrer's post Transformers From Scratch

Transformers were introduced in this 2017 paper as a tool for sequence transduction—converting one sequence of symbols to another. The most popular examples of this are translation, as in English to German. It has also been modified to perform sequence completion—given a starting prompt, carry on in the same vein and style.


In one-hot encoding a symbol is represented by an array of mostly zeros, the same length as the vocabulary, with only a single element having a value of one. Each element in the array corresponds to a separate symbol.
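
A minimal sketch of the one-hot encoding described above, in plain Python. The three-word vocabulary and the helper name `one_hot` are assumptions made for illustration; they are not taken from the post.

```python
def one_hot(symbol, vocabulary):
    """Return a list of zeros, as long as the vocabulary, with a single one."""
    vector = [0] * len(vocabulary)
    # The position of the one identifies the symbol.
    vector[vocabulary.index(symbol)] = 1
    return vector


# Hypothetical vocabulary, chosen only for this example.
vocabulary = ["files", "find", "my"]

encoded = one_hot("find", vocabulary)
print(encoded)  # [0, 1, 0]
```

Each position in the list stands for one word, so the vector grows with the vocabulary.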


One really useful thing about the one-hot representation is that it lets us compute dot products. These are also known by other intimidating names like inner product and scalar product. To get the dot product of two vectors, multiply their corresponding elements, then add the results.

Dot product illustration

Dot products are especially useful when we're working with our one-hot word representations. The dot product of any one-hot vector with itself is one.
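
The multiply-then-add recipe can be sketched in a few lines of plain Python. The function name `dot` and the two example vectors are assumptions for illustration. The second call shows a related fact: two different one-hot vectors have a dot product of zero, because their ones never line up.

```python
def dot(a, b):
    """Multiply corresponding elements of two vectors, then add the results."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


# Hypothetical one-hot vectors over a three-word vocabulary.
files = [1, 0, 0]
find = [0, 1, 0]

same = dot(files, files)
different = dot(files, find)
print(same)       # 1
print(different)  # 0
```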


This trick of using a one-hot vector to pull out a particular row of a matrix is at the core of how transformers work
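
A small sketch of the row-lookup trick, assuming a made-up 3x2 matrix and a hypothetical helper `vector_times_matrix`. Multiplying a one-hot vector by a matrix zeroes out every row except the one that lines up with the single one, so the product is that row.

```python
def vector_times_matrix(vector, matrix):
    """Multiply a row vector by a matrix stored as a list of rows."""
    if len(vector) != len(matrix):
        raise ValueError("vector length must match the number of rows")
    columns = len(matrix[0])
    return [
        sum(vector[r] * matrix[r][c] for r in range(len(matrix)))
        for c in range(columns)
    ]


# Hypothetical matrix values, chosen only for this example.
matrix = [
    [0.2, 0.8],
    [0.5, 0.5],
    [0.9, 0.1],
]
selector = [0, 1, 0]  # one-hot vector pointing at the second row

row = vector_times_matrix(selector, matrix)
print(row)  # [0.5, 0.5]
```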


One useful way to represent sequences is with a transition model. For every word in the vocabulary, it shows what the next word is likely to be. If users ask about photos half the time, files 30% of the time, and directories the rest of the time, the transition model will look like this. The sum of the transitions away from any word will always add up to one.

Markov chain transition model

This particular transition model is called a Markov chain, because it satisfies the Markov property that the probabilities for the next word depend only on recent words. More specifically, it is a first order Markov model because it only looks at the single most recent word.
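
A rough sketch of a first order transition model as a dictionary of dictionaries. The 50% / 30% / 20% split comes from the highlight above. Attaching it to the word "my", the rows for the other words, and the helper name `most_likely_next` are assumptions made for illustration.

```python
# Each key is the most recent word; each value maps a possible next word
# to its probability. Only the "my" row uses numbers from the highlight;
# the remaining rows are hypothetical.
transitions = {
    "my": {"photos": 0.5, "files": 0.3, "directories": 0.2},
    "photos": {"please": 1.0},
    "files": {"please": 1.0},
    "directories": {"please": 1.0},
}


def most_likely_next(word):
    """Return the next word with the highest transition probability."""
    row = transitions[word]
    return max(row, key=row.get)


# The transitions away from any word should add up to one.
for word, row in transitions.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9, word

prediction = most_likely_next("my")
print(prediction)  # photos
```

Because the model is first order, the lookup uses only the single most recent word and ignores everything before it.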


The combination of the most recent word with each of the words that came before makes for a collection of applicable rows, maybe a large collection. Because of this change in meaning, each value in the matrix no longer represents a probability, but rather a vote. Votes will be summed and compared to determine next word predictions.
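
A minimal sketch of summing votes, under stated assumptions: the word pairs, the vote values, and the helper name `tally_votes` are all hypothetical. Each applicable row contributes its votes, the votes are added per candidate word, and the candidate with the largest total becomes the prediction.

```python
# Each row is keyed by a pair (earlier word, most recent word) and holds
# votes for candidate next words. All values here are made up.
vote_rows = {
    ("battery", "ran"): {"down": 1.0, "please": 0.0},
    ("program", "ran"): {"down": 0.0, "please": 1.0},
    ("log", "ran"): {"down": 0.5, "please": 0.5},
}


def tally_votes(active_pairs):
    """Sum the votes from every applicable row, per candidate word."""
    totals = {}
    for pair in active_pairs:
        for word, vote in vote_rows[pair].items():
            totals[word] = totals.get(word, 0.0) + vote
    return totals


totals = tally_votes([("battery", "ran"), ("log", "ran")])
prediction = max(totals, key=totals.get)
print(totals)      # {'down': 1.5, 'please': 0.5}
print(prediction)  # down
```

The totals no longer sum to one, which is why they are read as votes rather than probabilities.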


To improve our results, we can additionally force the unhelpful features to zero by creating a mask. It's a vector full of ones except for the positions you'd like to hide or mask, and those are set to zero
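
The mask can be sketched as an element-wise multiplication. The feature values, the mask, and the helper name `apply_mask` are assumptions for illustration only.

```python
def apply_mask(features, mask):
    """Zero out the features wherever the mask holds a zero."""
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same length")
    return [f * m for f, m in zip(features, mask)]


# Hypothetical feature activities: 1 means the feature is present.
features = [1, 1, 1, 0, 1]
# Ones everywhere except the positions to hide.
mask = [1, 0, 0, 1, 1]

masked = apply_mask(features, mask)
print(masked)  # [1, 0, 0, 0, 1]
```

Only the features whose mask entry is one survive, so any later sum of votes ignores the hidden positions.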


This process of selective masking is the attention called out in the title of the original paper on transformers


What Will Happen in 2023

1 highlight/note in Fred Wilson's post What Will Happen in 2023

Good businesses with product market fit, positive unit economics, and strong leadership teams will raise capital although it will be at the new normal in terms of valuation. I believe that “new normal” is more or less where we were in 2015 where seed rounds were done around $10mm, A rounds were done around $15mm to $25mm, B rounds were done around $25mm to $50mm, and growth rounds had a cap at 10x revenues


The Rise of AI Leadership in the Enterprise

3 highlights/notes in linkedin.com's post The Rise of AI Leadership in the Enterprise

Traditional software leaders typically take a linear approach to product development with a clear vision of how the product will work and what it will do based on some articulation of “use cases” or “user journeys.” The development cycle of AI systems and products is iterative. AI leadership requires in-depth knowledge and understanding about when and how to strategically use AI, access to meaningful data, and a nonlinear approach.


successful AI executives know when AI can help a project and how to gradually evolve the sophistication of AI systems. They also know when AI is not the answer.


Experimentation is a critical skill, and it’s one that I have found missing in traditional software engineering leaders, who often view the path from start to finish as straight. Effective AI leaders recognize that there are many unknowns, so they structure projects to front-load the highest-risk areas. They use a hypothesis mindset and build MVPs (Minimum Viable Products) to test these hypotheses quickly to “fail fast.”


Good News About Cheese — It’s Much Healthier Than You Thought

8 highlights/notes in Stephanie Clarke's post Good News About Cheese — It’s Much Healthier Than You Thought

“Cheese is packed with nutrients like protein, calcium and phosphorus, and can serve a healthy purpose in the diet,” says Lisa Young, an adjunct professor of nutrition at New York University.


Old-school thinking on nutrition has been focused on individual nutrients — such as fats or protein — that either promote or prevent disease. It’s not clear that this is the wrong approach, but nutrition experts are now putting more emphasis on the entire food and how its structure, nutrients, enzymes and other components interact with one another.


In 2018, Feeney led a six-week clinical trial in which 164 people each ate an equal amount of dairy fat either in the form of butter or cheese and then switched partway through the study. “We found that the saturated fat in cheese did not raise LDL cholesterol levels to the same degree as butter did,” she says.


“Some studies show that the mineral content in cheese, particularly calcium, may bind with fatty acids in the intestine and flush them out of the body,” Feeney says. Other studies suggest that fatty acids called sphingolipids in cheese may increase the activity of genes that help with the body’s breakdown of cholesterol.


One reason cheese may help control weight is that it may reduce appetite more than other dairy products.


In a study of more than 145,000 people in 21 countries, the researchers found that eating two daily servings of full-fat dairy or a mix of full-fat and low-fat was linked to a 24 and 11 percent reduced risk of both conditions compared with eating none. Eating only low-fat dairy slightly raised the risk. And among people who didn’t have diabetes or hypertension at the start of the nine-year study, those who ate two servings of dairy each day were less likely to develop the diseases during the study.


the health benefits of cheese were found to be the greatest when it replaced a less healthful food like red or processed meats.


“Incorporating cheese into a Mediterranean-style diet where you also include fruits, veggies, whole grains and other foods known to lower disease risk is going to be the most beneficial to your overall health,”


The Most Successful and Introspective People I Know Conduct Annual...

1 highlight/note in Steve Schlafman's post The Most Successful and Introspective People I Know Conduct Annual...

1/ What were your major moments, milestones, and memories in 2022?


2/ What am I most proud of personally and professionally? Why?


3/ How am I different today than I was a year ago?


4/ What were my biggest lessons learned this year?


5/ What 3-5 words or phrases would I use to describe 2022?


6/ What am I most grateful for in 2022?


7/ Which dimensions in my life am I motivated to focus on in 2023?

🏃🏽‍♀️Health
👨‍👩‍👧‍👧 Family
👬 Friends
❤️ Romance
💰 Money
👩🏻‍💻 Career
💫 Spirituality
📚 Personal Growth
📈 Professional Growth
🕺🏻Leisure + Play
🖥️ Technology
🏡 Environment


8/ What big questions about my life am I asking myself?


9/ If I knew I couldn't fail, how would I spend my time, energy and attention in 2023?


10/ What attitudes, beliefs, behaviors and relationships do I want and need to shed?


11/ What do I want and need to say no to?


12/ What habits and behaviors do I want to start, stop and continue in the new year?


13/ How will I challenge myself in 2023?


14/ What relationships in my life will I prioritize in 2023?


15/ What is my purpose in 2023?

For inspiration 👇
schlaf.co/ambition/


16/ What 1-3 goals do I REALLY want to accomplish?

What's important about these goals to me?


17/ What does success look like?

How will I be different when I've achieved each one?


18/ Why might I fail to hit each goal?

What resistance might I encounter for each of my goals?

For reference 👇

schlaf.co/resistance/


19/ Who in my social circle and professional network can provide help, support and accountability?


20/ What 3 very specific but small steps can I take to get started and build momentum?



Solar System Model

1 highlight/note in xkcd.com's post Solar System Model

The Earth is, on average, located in the habitable zone, but at any given time it has a certain probability of being outside it, which is why life exists on Earth but is mortal.


Some Really Cool (Programming) Tech That I Think Are Underrated...

2 highlights/notes in tobi lutke's post Some Really Cool (Programming) Tech That I Think Are Underrated...

remix.run is the ideal way to make js based web development get the best from the server, the best from the client, and the most from the browser. It's as much fun to write remix as it is to write rails apps and that's a 10x improvement for JS imo.


Passkeys, you can play around with them on passkeys.io but hopefully you will see them everywhere soon


Lionel Messi Is Impossible

4 highlights/notes in Benjamin Morris's post Lionel Messi Is Impossible

It’s not possible to shoot more efficiently from outside the penalty area than many players shoot inside it. It’s not possible to lead the world in weak-kick goals and long-range goals. It’s not possible to score on unassisted plays as well as the best players in the world score on assisted ones. It’s not possible to lead the world’s forwards both in taking on defenders and in dishing the ball to others. And it’s certainly not possible to do most of these things by insanely wide margins.

But Messi does all of this and more.



Somehow, Messi has done even better when taking it on his own than when somebody sets him up. Moreover, on unassisted shots he shoots nearly 10 percent and .044 GAA better than the next best player (Sergio Aguero for Manchester City) does, despite taking the fourth-most such shots of the 28 players in the group.


the main stylistic difference between Messi and Ronaldo: Ronaldo takes more mid-range shots but misses a lot of them; Messi tries to beat a lot more defenders, loses sometimes, and then makes up for it (and then some) by having better assisting and shooting opportunities as a result


Based on the Conversations I’ve Been Having Lately With People...

1 highlight/note in Jack Altman's post Based on the Conversations I’ve Been Having Lately With People...

the end of free money means the end of easy growth. Average performance will no longer yield exceptional business results, and so companies are going to need to seek excellent performance out of necessity.


How to Kickstart and Scale a Marketplace Business

3 highlights/notes in Lenny Rachitsky's post How to Kickstart and Scale a Marketplace Business

Everyone's always looking for the hack - what's the channel that will unlock something big? But every time we looked for a reason we weren't growing, it always came back to the basics –– selection, delivery quality, pricing. That's it. Always come back to first principles. Whenever we made a mistake, we forgot this.


with the exception of one company, every single marketplace that I interviewed constrained their initial marketplace to more quickly get to critical mass


The conventional wisdom was to narrow focus to a category (like Amazon did with books) or geography (like Yelp did with San Francisco). We didn't get funding for many years in part because we did the opposite -- we did all categories and all geographies from the beginning. People thought that wouldn't work, that we were boiling the ocean. In retrospect it was the only way to build a marketplace in our space at scale -- being broad in category increased the frequency of use of our product from once every couple years (how often do you need to hire a house painter?) to 8-12 times a year (the number of Thumbtack services an average American household hires annually). And being broad in geography allowed us to scale our marketplace as fast as possible, giving us the revenue, traffic, and thus experimental velocity we needed to bootstrap a great product


Overview & Applications of Large Language Models

4 highlights/notes in Leigh Marie Braswell's post Overview & Applications of Large Language Models

We then discussed how most rote coding tasks - syntax questions you’d typically google, basic helper functions, & repetitive if/else branches - had become automated with Copilot’s suggestions, which ended up saving each of his technical teammates hours each day


I constantly have to remind myself that “modern” ML is nascent - 2011 was the first year a convolutional neural net (a type of deep learning model) won the most popular computer vision competition. Transformers, which power the LLMs mentioned above, were introduced by Google Brain in 2017


to train other types of LLMs (predicting specific software actions, answering healthcare questions, etc.) we need to figure out a way to generate enough relevant training data


important to note that the competitive advantage isn’t just the private data used to train the model initially, but the additional data you get when customers interact with the model, telling it what is right, wrong, and sometimes what the answer should be instead.


The Atomic Network

7 highlights/notes in Lenny Rachitsky's post The Atomic Network

the “atomic network” is the smallest network needed that can stand on its own. It needs to have enough density and stability to break through early anti-network effects, and ultimately grow on its own. I liken it to an atom because it is the unit upon which larger networks are ultimately built. If you can build one, and then another, you can build the rest of the network—this is the base unit to build everything else.


The networked product should be launched in its simplest possible form—not fully featured—so that it has a dead simple value proposition. The target should be on building a tiny, atomic network—the smallest that could possibly make sense—and focus on building density


The atomic network is a complementary point of view to Clayton Christensen’s Disruption Theory. These small networks often grow in niches, slowly growing to take over the entire market.


The first step to launching an atomic network is to have a hypothesis about what it might look like. My advice: Your product’s first atomic network is probably smaller and more specific than you think. Not a massive segment of users, or a particular customer segment, or a city, but instead something tiny, maybe on the order of hundreds of people, at a specific moment in time.


The more users you need to get to an atomic network, the harder it is to create.


The concept of atomic networks is powerful because if you can build one, you can probably build two. Each one often becomes easier, because each network can be intertwined with the next—Slack’s success within one company can help it become successful in another, as employees move about and introduce the product to new workplaces. Facebook’s early campus launches became easier over time as students’ friends at different schools began to demand the product more and more. Build a few atomic networks, and soon you can copy and paste them into many, many markets.


Thanks, Andrew! Grab your own copy of The Cold Start Problem, and follow Andrew on Twitter @andrewchen.


How to Get Rich (Without Getting Lucky):

4 highlights/notes in Naval's post How to Get Rich (Without Getting Lucky):

Seek wealth, not money or status. Wealth is having assets that earn while you sleep. Money is how we transfer time and wealth. Status is your place in the social hierarchy.


You’re not going to get rich renting out your time. You must own equity - a piece of a business - to gain your financial freedom.


You will get rich by giving society what it wants but does not yet know how to get. At scale.


Don't partner with cynics and pessimists. Their beliefs are self-fulfilling.


Five Founders

2 highlights/notes in Paul Graham's post Five Founders

Few know this, but one person, Paul Buchheit, is responsible for three of the best things Google has done. He was the original author of GMail, which is the most impressive thing Google has after search. He also wrote the first prototype of AdSense, and was the author of Google's mantra "Don't be evil."


PB made a point in a talk once that I now mention to every startup we fund: that it's better, initially, to make a small number of users really love you than a large number kind of like you.


Who's Responsible When Recommendations Kill?

2 highlights/notes in Casey Newton's post Who's Responsible When Recommendations Kill?

the company sends every video with about 3,000 views (it varies by country) to one of its 40,000 human moderators for review — something I don’t think a single other platform of its scale has committed to doing. It also works in various ways to detect users under the age of 13, who may be at the highest risk of harm from blindly attempting dangerous stunts, and terminate their accounts. (The company said it removed 41 million such accounts in the first half of this year.)


it’s also possible that the court could rule that platforms’ recommendations are not covered by Section 230, which would have dramatic implications.


Ask HN: Why Did Google Videos Fail?

1 highlight/note in ycombinator.com's post Ask HN: Why Did Google Videos Fail?

Susan Wojcicki was running it at the time and she originally thought Google Video would succeed because Google was "playing nice" by negotiating legal licenses with broadcasters like NBC. So in theory… Google Video would have the high-end desirable content (e.g. tv shows) that would be "better quality" than the amateur home videos on Youtube.