Machine Learning

64K members · Est. Mar 31, 2022 · Updated Feb 10, 2026
ƬⲘ ⚔️ @tm23twt · Feb 6
timelapse 3

- was going through lilian weng's policy gradient blog
- covered up to a3c, went through the policy gradient theorem proof (statement below) & found another great blog
- next up is dpg, ddpg & d4pg
- haven't implemented them yet, only theory as of now, but will do it ofc
- & this timelapse looks good niceee :) https://t.co/hyt6yB4mQi
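
for reference, the policy gradient theorem that proof establishes, in its standard form (d^{\pi_\theta} is the discounted state distribution):

    \nabla_\theta J(\theta) = \mathbb{E}_{s \sim d^{\pi_\theta},\, a \sim \pi_\theta}\left[ \nabla_\theta \log \pi_\theta(a \mid s)\, Q^{\pi_\theta}(s, a) \right]
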
ƬⲘ ⚔️ @tm23twt · Feb 5
timelapse 2

i think session-wise timelapses are good & then i'll merge one for the day.
still need to speed things up, anyway this one's better ig :) https://t.co/kj9zm8cjNU
Sebastian Buzdugan @sebuzdugan · Feb 4
ppo: proximal policy optimization is the math that allows the model to learn from the reward model without changing its personality too drastically. it keeps the updates small and stable so the model doesn't go off the rails. [56/100]
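
a minimal numpy sketch of the clipped surrogate objective that does the "keeps the updates small" part (tensor names are illustrative, not from any particular library):

    import numpy as np

    def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
        # probability ratio between the updated policy and the old one
        ratio = np.exp(logp_new - logp_old)
        # take the more pessimistic of the raw and clipped terms, so no
        # single update can move the policy more than ~eps away
        unclipped = ratio * advantages
        clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
        return -np.mean(np.minimum(unclipped, clipped))
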
Sebastian Buzdugan @sebuzdugan · Feb 4
reward model: during rlhf, we actually train a second judge model that learns to predict what a human would like. this judge then watches the main model work and gives it a thumbs up or thumbs down millions of times. [55/100]
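
those millions of thumbs up/down are usually learned from pairwise comparisons; a sketch of the standard bradley-terry style loss, assuming the judge outputs a scalar score per response:

    import numpy as np

    def reward_pair_loss(score_chosen, score_rejected):
        # the judge should score the human-preferred response higher;
        # this is -log(sigmoid(chosen - rejected)), written with
        # logaddexp for numerical stability
        margin = score_chosen - score_rejected
        return np.mean(np.logaddexp(0.0, -margin))
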
ƬⲘ ⚔️ @tm23twt · Feb 4
test timelapse ig, bruh there's a hell of a lot of light from that source, need to cover it or make it translucent :) https://t.co/r7N5dbkIdk
Sebastian Buzdugan @sebuzdugan · Feb 3
rlhf: reinforcement learning from human feedback is the secret sauce for alignment. we have humans rank different ai responses and then train the model to maximize the score given by those human preferences, making it safer. [54/100]
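
the human rankings themselves are simple data; a sketch of how one ranking over several responses becomes the pairwise comparisons the reward model trains on (response names are made up):

    # a human ranks model responses from best to worst; every ordered
    # pair (better, worse) becomes one training comparison
    ranking = ["response_b", "response_d", "response_a", "response_c"]
    pairs = [(better, worse)
             for i, better in enumerate(ranking)
             for worse in ranking[i + 1:]]
    print(pairs)  # a ranking of 4 yields 6 comparisons
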
Sebastian Buzdugan @sebuzdugan · Feb 3
instruction tuning: this is a specific type of fine-tuning where we teach the model to follow specific commands, like "summarize this" or "write a poem". it is what makes ai feel like a tool rather than just a generator. [53/100]
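
a sketch of what one instruction-tuning example looks like before tokenization (the prompt template here is illustrative; real datasets each define their own):

    # the model is trained on the whole string, but the loss is usually
    # computed only over the response tokens, not the prompt
    example = {
        "instruction": "summarize this",
        "input": "the quick brown fox jumps over the lazy dog.",
        "response": "a fox jumps over a dog.",
    }
    prompt = (f"### Instruction:\n{example['instruction']}\n\n"
              f"### Input:\n{example['input']}\n\n### Response:\n")
    full_text = prompt + example["response"]
    print(full_text)
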
Sebastian Buzdugan @sebuzdugan · Feb 2
training loss curve: engineers spend their lives staring at a downward graph of loss. if the line stops going down, it means the model has stopped learning or has hit a bottleneck, and it is time to change the architecture or data. [51/100]
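
"the line stops going down" can be checked mechanically; a sketch of a moving-average plateau test (window and threshold are arbitrary choices):

    import numpy as np

    def has_plateaued(losses, window=100, min_improvement=1e-3):
        # compare the mean loss over the last window against the window
        # before it; too little improvement counts as a plateau
        if len(losses) < 2 * window:
            return False
        recent = np.mean(losses[-window:])
        previous = np.mean(losses[-2 * window:-window])
        return (previous - recent) < min_improvement
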
Sebastian Buzdugan @sebuzdugan · Feb 1
pre-training: this is the most expensive phase where the model reads the entire internet. it is where it learns the fundamental structure of human thought and language, before it is taught any specific tasks or rules. [50/100]
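
"reads the entire internet" concretely means next-token prediction; a sketch of the pre-training loss on one sequence, with random logits standing in for a model's output:

    import numpy as np

    rng = np.random.default_rng(0)
    vocab, seq_len = 50_000, 8
    tokens = rng.integers(0, vocab, size=seq_len)
    # position t predicts token t+1, so there are seq_len - 1 predictions
    logits = rng.standard_normal((seq_len - 1, vocab))
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    # average cross-entropy of the true next tokens; for random logits
    # this sits near log(vocab)
    loss = -np.mean(log_probs[np.arange(seq_len - 1), tokens[1:]])
    print(f"next-token loss: {loss:.3f}")
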
Sebastian Buzdugan @sebuzdugan · Feb 1
speculative decoding: we can speed up giant models by having a tiny assistant model guess the next five words and then asking the giant model to verify them all at once. if the small model was right, we save a lot of time and compute. [49/100]
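
a sketch of the accept/advance loop in its greedy form (draft_model and big_model are hypothetical callables; real implementations use a probabilistic acceptance rule over the two models' distributions):

    def speculative_step(prefix, draft_model, big_model, k=5):
        # the tiny model cheaply guesses the next k tokens one by one
        ctx, guesses = list(prefix), []
        for _ in range(k):
            token = draft_model(ctx)
            guesses.append(token)
            ctx.append(token)
        # the big model scores all k positions in one parallel pass;
        # keep guesses until the first disagreement, then take the big
        # model's token there and stop
        verified = big_model(list(prefix), guesses)
        accepted = []
        for guess, correct in zip(guesses, verified):
            accepted.append(correct)
            if guess != correct:
                break
        return list(prefix) + accepted
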
Sebastian Buzdugan @sebuzdugan · Jan 29
flash attention: one of the biggest bottlenecks in ai is moving data around the gpu. flash attention is a brilliant mathematical re-write of the attention process that makes it fit into the fastest memory on the chip, speeding up training by several times. [43/100]
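
the core trick is an online softmax: the full score matrix is never materialized, it is consumed block by block while a running max and normalizer are maintained; a single-query numpy sketch (real flash attention also tiles the queries and keeps everything in on-chip sram):

    import numpy as np

    def online_softmax_attention(q, K, V, block=64):
        # running max m, normalizer ell, and weighted sum acc
        m, ell, acc = -np.inf, 0.0, np.zeros(V.shape[1])
        scale = 1.0 / np.sqrt(q.shape[0])
        for s in range(0, K.shape[0], block):
            Kb, Vb = K[s:s + block], V[s:s + block]
            scores = Kb @ q * scale
            m_new = max(m, scores.max())
            # rescale what was accumulated so far to the new running max
            corr = np.exp(m - m_new)
            p = np.exp(scores - m_new)
            ell = ell * corr + p.sum()
            acc = acc * corr + p @ Vb
            m = m_new
        return acc / ell
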
Sebastian Buzdugan @sebuzdugan · Jan 28
quantization: this is the art of squashing giant models down by using 4-bit or even 2-bit numbers. it is what allows you to run a massive, world-class ai on a simple laptop or phone that would otherwise be way too weak. [42/100]
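
a sketch of the simplest version, symmetric round-to-nearest quantization to the 16 levels an int4 can hold (real schemes add per-group scales, outlier handling, etc.):

    import numpy as np

    def quantize_4bit(w):
        # map floats into [-8, 7], the signed int4 range
        scale = np.abs(w).max() / 7.0
        q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.default_rng(0).standard_normal(8).astype(np.float32)
    q, s = quantize_4bit(w)
    print(w)
    print(dequantize(q, s))  # close, but only 16 distinct values remain
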
Sebastian Buzdugan @sebuzdugan · Jan 28
floating point precision: ai models are usually built with very precise 16-bit or 32-bit numbers, but we are finding that we can often use much less precision without the model getting dumber, which saves massive amounts of power and memory. [41/100]
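
the resolution loss is easy to see directly; a tiny numpy demo:

    import numpy as np

    # fp32 keeps a small increment that fp16 rounds away entirely:
    # near 1.0, fp16 steps are ~0.001 while fp32 steps are ~0.0000001
    print(np.float32(1.0) + np.float32(1e-4))  # 1.0001
    print(np.float16(1.0) + np.float16(1e-4))  # 1.0
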
Sebastian Buzdugan @sebuzdugan · Jan 26
perplexity: this is a mathematical measurement of how confused a model is by a piece of text. a lower perplexity means the model found the text very predictable and easy to understand, which is a sign of a well-trained brain. [37/100]
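
and the formula is tiny: perplexity is just the exponential of the average next-token loss; a sketch with made-up probabilities:

    import numpy as np

    # probability the model assigned to each actual next token in a text
    token_probs = np.array([0.9, 0.6, 0.95, 0.4, 0.8])
    perplexity = np.exp(-np.mean(np.log(token_probs)))
    print(perplexity)  # ~1.4: as if choosing among ~1.4 equally likely tokens
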
Sebastian Buzdugan @sebuzdugan · Jan 24
HeartMuLa shows why music generation is not just “text to audio”

- separate global structure from local detail
- compress audio into semantic tokens
- generate minutes of music autoregressively
- this design pattern will show up everywhere https://t.co/Emllhjxyas
Sebastian Buzdugan @sebuzdugan · Jan 24
adam optimizer: standard gradient descent is a bit clunky, so we use adam. it acts like a smarter version that remembers the momentum of past updates and adjusts the speed for each individual weight, making the whole training process much smoother. [33/100]
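
a sketch of a single adam update in numpy, with both "memories" spelled out (hyperparameters are the usual defaults):

    import numpy as np

    def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        # m: running mean of gradients (the momentum it remembers)
        m = b1 * m + (1 - b1) * grad
        # v: running mean of squared gradients, giving each weight
        # its own effective step size
        v = b2 * v + (1 - b2) * grad ** 2
        # bias correction, since m and v start at zero (t counts from 1)
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
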
Dan Kornas @DanKornas · Jan 22
AI Engineers: stop shipping notebook spaghetti into production!

Python Object-Oriented Programming (5th Edition) is a practical guide that teaches you how (and when) to apply OOP to build scalable, maintainable Python applications.

I reviewed this book because I keep reminding https://t.co/95SSTUs8QM
Dan Kornas @DanKornas · Jan 19
Build a Reasoning Model (From Scratch) https://t.co/RnxQ2z9VzH
mohit @mohitwt_ · Jan 19
I'm working with @RubenVeidt on my deep learning framework to support his image editor and more

for that reason we are rewriting the framework to support future features, and features specific to images, videos, etc. ... not just text, and of course we will use no
Sebastian Buzdugan @sebuzdugan · Jan 16
attention scores: the model calculates a score for every pair of words in a sequence to decide how much focus to put on each one. it divides these scores by a constant (the square root of the key dimension) to keep the math stable, so training doesn't fail when the model gets deep. [15/100]
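
a numpy sketch of the scaled dot-product attention being described:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        d_k = Q.shape[-1]
        # one score per pair of positions, scaled by sqrt(d_k) so the
        # softmax doesn't saturate as dimensions grow
        scores = Q @ K.T / np.sqrt(d_k)
        # softmax turns scores into weights that sum to 1 per query
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        return w @ V
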

Sebastian Buzdugan @sebuzdugan
ML Engineer | PhD Student in AI | Building @getfrai
2.7K Followers · 14 Contributions

ƬⲘ ⚔️ @tm23twt
dead or alive . research intern @UOsaka_ja
3.0K Followers · 3 Contributions

Dan Kornas @DanKornas
AI/ML Engineer AI Notes: https://t.co/lC2UKMtRjj Youtube: https://t.co/pjpX8NvUn5 Newsletter: https://t.co/NMMvPSmzua
89.9K Followers · 2 Contributions

mohit @mohitwt_
building custom dl framework from scratch
972 Followers · 1 Contribution
64.3K Total Members · +12 24h Growth · +144 7d Growth
Date          Members  Change
Feb 10, 2026  64.3K    +12
Feb 9, 2026   64.3K    +38
Feb 8, 2026   64.3K    +14
Feb 7, 2026   64.2K    +20
Feb 6, 2026   64.2K    +33
Feb 5, 2026   64.2K    +27
Feb 4, 2026   64.2K    +16
Feb 3, 2026   64.1K    +17
Feb 2, 2026   64.1K    +25
Feb 1, 2026   64.1K    +9
Jan 31, 2026  64.1K    +13
Jan 30, 2026  64.1K    +18
Jan 29, 2026  64.1K    +21
Jan 28, 2026  64K

Community Rules

Be kind and respectful.
Keep Tweets on topic.
Explore and share.