Tech blog

February 20, 2024

RAG system for advanced document search at large enterprises

Navigating company internal documentation with smart search and retrieval augmented generation systems.

Retrieval Augmented Generation (RAG) systems may be a game-changer, reshaping how corporations tackle industry complexities. Modulai has formed strategic partnerships in finance and manufacturing, tailoring RAG solutions to meet the unique demands of large enterprises.

January 22, 2024

AI-Generated imagery in digital and print media for Bonnier News

Is it feasible for current image generation models to produce high-quality, photorealistic visual content suitable for both print in glossy magazines and digital publishing?

In this blog post, we discuss the framework we used to answer this question and provide a Google Colab notebook with Python code for an automated analysis of gender bias in image-generating models. Spoiler alert: an online A/B test carried out by Bonnier News (a controlled experiment in which two variations of an ad are shown to different groups of website visitors) revealed a notable preference for AI-generated images. Specifically, ads with AI-generated images showed a markedly higher click-through rate.

October 05, 2023

Finetuning GPT3.5 – Rick Sanchez

Large Language Models (LLMs) have taken the world by storm. Most prominent among them, ChatGPT has disrupted our daily and professional lives. From high-school essays to creative writing, coding, and journalism, AI writing assistants are advancing, and we must embrace them. While prompt engineering often helps align the output of these models to specific tasks, it can sometimes fall short, particularly when trying to follow a certain style or tone.

May 03, 2023

Speech emotion recognition

This post focuses on the work we have done in the field of SER in a project conducted with journalists from SvD, one of Sweden's top daily newspapers.

The amount of digitally stored speech in the form of interviews, lectures, debates, radio talk show archives, and podcasts keeps growing. As a result, Speech Emotion Recognition (SER) has become one of the most active research topics in computational linguistics over the last two decades. SER, as the name suggests, refers to the automatic detection of emotion from audio samples.

April 01, 2023

Few-shot information extraction from PDF documents

Text field extraction in unstructured PDFs using LLMs

A lot of data comes in the form of unstructured text documents. Some examples are invoices, offers, and product data assets such as technical information and manuals. To make this data available for analysis, we often want to extract the textual information in the document and convert it to a structured format. In this post, we will take a look at how LLMs with few-shot prompting can accelerate this process.

February 24, 2023

Empowering Journalists at SvD with AI Tools for Podcast Analysis

While podcasts offer a convenient and accessible media format for on-the-go consumption, they can present significant challenges when it comes to searching and analysing content, especially compared to traditional text-based media such as news articles and interviews.

In this blog post, we’ll explore the ways in which AI can be used to analyse podcasts and how AI can empower journalists.

November 11, 2022

Smarter Operations with AI: Detecting Fraud and Boosting User Engagement

AI and machine learning models can be valuable tools for many businesses. This blog post highlights a few examples of how machine learning models can be used to protect your business, increase its value, and gain insights into it.

In 2022, we built ML fraud detection models and recommendation systems for the UK-based company On Device Research to complement their platform, Curious Cat, which allows users to find and take surveys.

September 01, 2022

Leveraging 3D Engines for Data Generation in Deep Learning

Applied to the specific case of scene text detection.

In the field of computer vision, synthetic data generation is especially interesting since the number of relevant resources and tools has grown and their quality has improved significantly over the years. This development has not taken place in the field of machine learning, though, but rather in 3D engines such as Unreal Engine, Blender and Unity. Realistic, highly detailed scenes are available, often produced by professional designers.

August 11, 2022

Explainable AI in medicine: Detecting AF in ECG data

Machine learning models usually do not explain their predictions. This is a significant barrier to adoption in domains like medicine, where understanding how the model works is vital.

In this blog post, we take a look at how explainability techniques can be used on a deep learning model predicting atrial fibrillation from sinus ECG (electrocardiogram) data.

December 01, 2021

Zero-Shot Learning in NLP

Recent transformer-based language models, such as RoBERTa, ALBERT, and OpenAI GPT, have shown a powerful ability to learn universal language representations. However, in many real-world scenarios, the scarcity and cost of labeled data are still limiting factors.

Zero-shot learning (ZSL) is a form of transfer learning that aims to learn patterns from labeled data in order to detect classes that were never seen during training. As scarce labeled data and limited scalability are recurring problems in machine learning applications, ZSL has gained much attention in recent years thanks to its ability to predict unseen classes.

August 31, 2021

MLOps – Deploying a recommender system in a production environment

When developing an integrated ML system, surprisingly little time is spent on actual model development. Most of the time goes into creating the right prerequisites for model deployment, that is, MLOps.

The following sections describe how we work with MLOps at Ahlsell. We’ll go through data ingestion and processing, modeling, and evaluation pipelines, as well as how we put it all together in an automated CI/CD pipeline.

December 27, 2020

Graph neural networks

Graph Neural Networks (GNNs) are neural network architectures that learn on graph-structured data. In recent years, GNNs have rapidly improved in terms of ease of implementation and performance, and more success stories are being reported. In this post, we will briefly introduce these networks, their development, and the features that have led to their success.

We will dive deeper into three use cases: citation networks and drug discovery, using the Deep Graph Library (DGL) package, and e-commerce, using PyTorch Geometric.

October 06, 2020

Enabling real time e-sport tracking with streaming video object detection

The esports industry has seen tremendous growth lately. Each tournament is streamed live and reaches several million viewers all around the world, increasing the demand for live updates on games and players, for uses such as live betting.

To improve the experience of watching these tournaments, Abios Gaming provides an API for live information on games, teams, and players. To strengthen Abios Gaming’s offer and enable real-time monitoring of esports games, Modulai joined forces with their tech team to build a deep learning object detection solution that extracts information on the fly from real-time video streams of gaming tournaments.

September 09, 2020

Classification of hypoglycemia causes in blood sugar time series

We give an overview of a study conducted in collaboration with Daniel Espes and Per-Ola Carlsson at Uppsala University aiming to improve the treatment of type 1 diabetes.

Type 1 diabetes is one of the most common chronic disorders among children and adolescents, but it affects people of all ages. Globally, more than 5 million people are affected.
