AI is changing quickly, driven in part by the rise of autonomous agents: systems that can reason, call external tools, and make decisions on their own. This paper examines two projects that apply such agents to data aggregation and natural language processing (NLP). The first uses xAI's Grok-3-mini, a cloud-hosted large language model (LLM), to aggregate product information from many online stores concurrently. It combines a master-worker architecture for asynchronous scraping, zero-shot entity alignment for matching items across sites without task-specific examples, and caching to keep the data current and fast to access. The second project runs LLMs on the edge device itself, using Ollama and a ReAct loop to perform reverse image search. It prioritises privacy, offline operation, and multilingual support, specifically summarisation in Uzbek. The two projects differ sharply in how much they rely on the cloud versus local hardware. In the e-commerce aggregator, Grok's tool-calling ability lets it refine its queries and pull in information from external sources, while a time-to-live (TTL) cache in Redis substantially reduces response time. Compared with other approaches, this method is considerably faster and aligns entities more accurately. The reverse image search agent, by contrast, uses Ollama's llama3.1 to decide autonomously which search engines to query (TinEye, Yandex, Bing) and then analyses the image on the device to generate a description. No part of the image is sent elsewhere, so the user's privacy is preserved.
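The master-worker scraping and time-limited caching described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `TTLCache` is an in-memory stand-in for a Redis key with an expiry, `fetch_listing` is a hypothetical scraper that sleeps instead of making an HTTP request, and the store names and worker count are invented for the example.

```python
import asyncio
import time


class TTLCache:
    """In-memory stand-in (assumption) for a Redis key with an expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Entry is stale: drop it so the caller re-fetches.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)


async def fetch_listing(store, query):
    """Hypothetical scraper; a real worker would issue an HTTP request here."""
    await asyncio.sleep(0.01)
    return {"store": store, "query": query, "title": f"{query} @ {store}"}


async def worker(queue, results, cache, query):
    """Pull store names off the queue until cancelled by the master."""
    while True:
        store = await queue.get()
        try:
            key = f"{store}:{query}"
            item = cache.get(key)
            if item is None:  # cache miss: scrape and remember the result
                item = await fetch_listing(store, query)
                cache.set(key, item)
            results.append(item)
        except Exception as exc:
            # One failing store must not stop the other workers.
            results.append({"store": store, "error": str(exc)})
        finally:
            queue.task_done()


async def master(stores, query, cache, n_workers=3):
    """Fan the stores out to a small pool of workers and collect results."""
    queue = asyncio.Queue()
    for store in stores:
        queue.put_nowait(store)
    results = []
    workers = [
        asyncio.create_task(worker(queue, results, cache, query))
        for _ in range(n_workers)
    ]
    await queue.join()  # wait until every store has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Calling `asyncio.run(master(["store_a", "store_b"], "phone", TTLCache(60)))` returns one record per store, in completion order rather than input order. A second call within the TTL is served from the cache without re-fetching.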
Building these systems brought out several lessons about agent design: how to balance scalability with autonomy, how much reasoning loops matter for deciding what to do next, and how to adapt a system to low-resource languages. From my own hands-on experience, it is also essential to build systems that keep working when individual components fail, and to consider hybrid designs that mix cloud and local processing. This work aligns with current research on decentralised AI and offers practical guidance for improving agent systems that must handle heterogeneous data.
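A reasoning loop that chooses a search engine and keeps going when one of them fails might look roughly like this. It is a sketch under assumptions, not the agent described in the paper: the `decide` callable stands in for the LLM (which the paper runs through Ollama), `scripted_policy` is a hypothetical rule-based replacement so the example runs offline, and the two engine functions are stubs rather than real TinEye or Yandex clients.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, str]   # (tool name or "finish", argument)
Step = Tuple[str, str]     # (tool name, observation)


def react_loop(
    decide: Callable[[str, List[Step]], Action],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 5,
) -> Tuple[str, List[Step]]:
    """Alternate between choosing an action and observing its result."""
    history: List[Step] = []
    for _ in range(max_steps):
        name, arg = decide(task, history)      # "reason" step
        if name == "finish":
            return arg, history
        tool = tools.get(name)
        if tool is None:
            observation = f"error: unknown tool {name}"
        else:
            try:
                observation = tool(arg)        # "act" step
            except Exception as exc:
                # A broken tool becomes an observation, not a crash.
                observation = f"error: {exc}"
        history.append((name, observation))
    return "no answer within step budget", history


def tineye_stub(image_path: str) -> str:
    """Hypothetical engine that is always unreachable."""
    raise TimeoutError("engine unreachable")


def yandex_stub(image_path: str) -> str:
    """Hypothetical engine that always finds two matches."""
    return "2 matches for " + image_path


def scripted_policy(task: str, history: List[Step]) -> Action:
    """Rule-based stand-in (assumption) for the LLM's decision."""
    for _, observation in history:
        if not observation.startswith("error"):
            return ("finish", observation)
    tried = [name for name, _ in history]
    for name in ("tineye", "yandex", "bing"):
        if name not in tried:
            return (name, task)
    return ("finish", "all engines failed")


tools = {"tineye": tineye_stub, "yandex": yandex_stub}
answer, trace = react_loop(scripted_policy, tools, "cat.jpg")
```

Here the first engine fails, the failure is recorded as an observation, and the policy moves on to the next engine. Swapping `scripted_policy` for a function that prompts a local model would leave the loop itself unchanged.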
Dostonbek Abdurakhmonov studied this question.
www.synapsesocial.com/papers/69d8946e6c1944d70ce05689 — DOI: https://doi.org/10.5281/zenodo.19451657