

Automating Context File Updates via Git Hook
For my hobby projects, I usually have an AI-generated context file that captures what the project does and how it’s structured. This helps me (or an agent) get up to speed quickly. The problem starts once I have made a lot of progress on the project: I forget about the context file, and it becomes outdated pretty fast. So instead of trying to be more disciplined about documentation, I decided to automate the problem away....
Semantic Word Ladder: Embeddings and A* Search
I hadn’t written in a while, and to get over my writer’s block I decided to pick up a small weekend project and have some fun with it. I started with embeddings and tried to think of them as a space you could move through. Each word becomes a node, and the goal is to get from a start word to a target word through small semantic steps. I embedded a dataset of common words and treated the whole thing as a graph traversal problem using the A* search algorithm....
Running Ephemeral Dataproc Clusters on Airflow using GCP Composer
What is Dataproc? Dataproc is Google Cloud’s fully managed service for running Apache Spark, Hadoop, and other open-source data processing tools. It excels at handling large-scale data processing and pipeline operations. Ephemeral Dataproc clusters are temporary clusters that are created on demand when processing is needed; automatically terminate once their tasks are complete; help optimize costs by only running when necessary; and can be managed through automation tools like Airflow or Terraform. What is Cloud Composer?...
Socket IO & Adapter When Using Multiple Servers
Why WebSockets? Limitations Of HTTP With HTTP, the client requests a resource and the server responds with the requested data. Communication is unidirectional: the data must first be requested by the client. A workaround for this limitation was HTTP long polling. With long polling, the client makes an HTTP request with a long timeout period, and the server uses that period to hold the request open until it has data to send to the client....
Docker: Images & Layers
What Is Docker? Docker is a tool for building, deploying, and running applications in containers, on any platform and irrespective of the underlying infrastructure. A container is a unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. Since Docker packages all the dependencies of an application together at the required versions, it solves the problem of missing dependencies or version conflicts in a developer’s local environment or in production....
Node JS: Event Loop
Despite being single-threaded, Node.js allows asynchronous programming and non-blocking operations by using the event loop. Node has an event-driven architecture, and code is executed in the form of callbacks. The event loop schedules the callbacks to run on the single thread at a given point. Libuv is the library that provides the implementation of the event loop and thread pool in Node.js. When a Node process starts execution, the top-level code is executed first, and all the callbacks are registered....
JS Engine And Execution Context
JavaScript engines are programs that convert JavaScript code into native machine code. These engines are embedded into the browser for runtime compilation and execution of the code. Google Chrome uses the V8 engine, Safari uses JavaScriptCore, and Firefox uses SpiderMonkey. To transform and run code faster, modern JavaScript engines use Just In Time (JIT) compilation, which combines interpretation and compilation. How JS Engine Works Parser...
JUnit: Misuse Of any() Matchers
For writing unit tests in JUnit, Mockito provides argument matchers that you can use to match the arguments in a given() or when() statement. Among the matchers provided, the any() matcher accepts any argument passed to the method, so the test case passes with any input. When the any() matcher is used extensively, out of laziness in specifying the parameters, there is a high chance that the test case is not really testing anything or that it passes even with a wrong input....
JPA: Persistence Context And Dirty Check Mechanism
A few entities were being persisted to the database in certain cases, without explicitly calling save() or having the @Transactional annotation. Digging a little deeper, I realised this behaviour is due to the persistence context and the dirty check mechanism. Persistence Context Observation 1: Two entities, user and account, are retrieved from the database and fields of both entities are modified. When save() was called only on the account entity, both entities were persisted to the database....
Internals Of Java Parallel Streams
The Stream API was introduced in Java 8 as an efficient way to operate on collections. Parallel streams were introduced as part of it, to process data in parallel and make applications run faster. Though parallel streams are supposed to improve application performance by splitting a task between multiple threads and completing it faster than sequential execution would, they can sometimes slow down the entire application....