Autonomía digital y tecnológica

Code and ideas for a distributed internet

Linkoteca. machine learning


Autoregressive text generation word by word

Let’s start by briefly describing what fine-tuning is. It means adjusting a model’s weights so that it fits your specific needs. Imagine you’re dealing with legal documents, where a word like ‘consideration’ means something quite different from what it means in everyday speech or in the data the model was trained on. Fine-tuning makes sure the model gets these specialized terms right. It’s not just about words either: you can also teach the model to follow specific rules, like keeping answers short and to the point, or to understand your business needs at a deeper level. So, if you’re planning to deploy it in production, this process transforms a general-purpose model into something custom-built for your data.
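To make "tweaking its weights" concrete, here is a toy sketch in plain Python. It is not how a language model is fine-tuned in practice: the two-weight linear model, the "pretrained" values, and the domain data are all hypothetical stand-ins chosen for illustration. The idea it shows is the same, though: start from weights that already exist and keep running gradient descent on a small, specialized dataset.

```python
# Toy illustration: fine-tuning = continuing gradient descent from
# already-trained weights on a small, domain-specific dataset.

def predict(weights, x):
    """Linear model y = w * x + b."""
    w, b = weights
    return w * x + b

def mse(weights, data):
    """Mean squared error of the model on (x, y) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(weights, data, lr=0.05, steps=500):
    """Run plain gradient descent on the MSE, starting from the given weights."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

# Hypothetical "pretrained" weights, standing in for a general-purpose model.
pretrained = (2.0, 0.0)

# Hypothetical domain data that follows a different rule (y = 3x + 1),
# standing in for specialized text such as legal documents.
domain_data = [(x, 3 * x + 1) for x in [0.0, 1.0, 2.0, 3.0]]

tuned = fine_tune(pretrained, domain_data)
loss_before = mse(pretrained, domain_data)
loss_after = mse(tuned, domain_data)
print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.6f}")
print(f"tuned weights: w={tuned[0]:.3f}, b={tuned[1]:.3f}")
```

The error on the domain data drops because the weights move from their general-purpose starting point toward values that fit the specialized examples. Real fine-tuning applies the same loop to billions of weights, usually through a deep learning framework.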

The lightweight models, specifically the 1B and 3B (where NB means N billion parameters), are among the most interesting for a variety of reasons. Unlike the larger versions, they are relatively easy to run locally, and Meta claims they can be deployed smoothly on the hardware found in mobile devices. This opens the door to many applications, as language processing can be done locally, without data leaving the device. As a result, these models do not rely on a stable internet connection and handle sensitive information more privately. With these advantages, we can expect to see more tools like personal assistants running on our smartphones.
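A quick back-of-the-envelope calculation shows why parameter count matters for running a model locally. The sketch below estimates the memory needed just to hold the weights at common numeric precisions. It assumes the usual storage sizes (4 bytes per parameter for 32-bit floats, 2 for 16-bit, 1 for 8-bit integers, half a byte for 4-bit quantization) and ignores activations, caches, and runtime overhead, so actual usage will be higher. The function name and table are illustrative, not taken from any library.

```python
# Rough memory needed to store model weights only (no activations,
# no caches, no runtime overhead), at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params_billion, precision):
    """Approximate weight storage in GB (1 GB = 10**9 bytes)."""
    total_bytes = params_billion * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 1e9

for size in (1, 3):
    row = ", ".join(
        f"{precision}: {weight_memory_gb(size, precision):.1f} GB"
        for precision in BYTES_PER_PARAM
    )
    print(f"{size}B parameters -> {row}")
```

By this estimate a 1B model needs about 2 GB for its weights at 16-bit precision, and a 3B model about 1.5 GB at 4-bit, which is within reach of the memory found on recent phones.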

Ten years ago, Facebook already had 15 billion photos in its database. As you uploaded pictures and tagged friends and added date and location data, the software got really, really good at recognizing people’s faces. This facial-recognition capability is mirrored at other companies—and some, such as Amazon, sell it to whoever wants it.

There isn’t some global corporate conspiracy to get you to post a photo of yourself from the old days and today. There has been a global corporate conspiracy to get you to post everything about yourself, continuously, for the past 15 years. Which many of us have done, providing the vast data sets that companies have already trained their neural networks with. If you think that not posting these two photos does anything to surveillance capitalism or the platforms that succeed through it, that’s just not right.