By Teresa Meile Bailey

In part 1, we discussed how to uncover the project’s underlying problems. Now, we’ll share six specific strategies to give projects more momentum. These strategies can be led by people in product, user experience, development, marketing, sales, or other influential roles. You can also proactively use these six strategies to keep projects from stalling out in the first place:

By Teresa Meile Bailey

Whether you manage, design, or build projects, you will find yourself in situations where a project gets put on hold — officially or unofficially — for one reason or another. Sometimes it’s obvious why it’s on hold, and sometimes there’s a misunderstanding that needs to be resolved. Whether it’s a prioritized project that must move forward, or a pet project that you want to advance, this two-step approach will help you and your team get it going again.

● In this post, we’ll discuss how to uncover a project’s underlying challenges and early steps to address…

By Matt Chudoba

At Integral Ad Science, we process high volumes of partner data through our platform every day so that we can quickly provide meaningful insights to customers. Our partners control when the data is made available, and sometimes there are delays due to API outages or planned downtime.

This two-part series gives an overview of our data intake and processing pipelines, focusing on how we built automatic monitoring and tools to handle late-arriving data effectively.

This post addresses how we designed our architecture to accommodate late-arriving data, while the next outlines specific steps in…

By Karishma Agarwal

Integral Ad Science (IAS) processes trillions of data requests globally each month, with several mapping tables in MySQL to manage these events in real time. We typically use a distributed cache such as Redis or Aerospike to keep this data in memory and perform real-time lookups with very high throughput. However, we recently worked on a project to evaluate two new approaches:

We wanted to add a cache to the existing pipeline for storing data from the mapping tables in MySQL and use it…

by Feng Fan

Recently I worked on a project to evaluate different left-join options for a Spark application we are building to modernize our largest data pipeline. The pipeline processes about 2B events per hour, creating a data set of about 0.5B records. In the old pipeline, a long-running left-join operation took 20 minutes to finish using Pig over MapReduce. My task was to benchmark this left-join operation with different Spark join options. This article shares what I learned during that project.

Dataset Sizes and Test Environment

On the left side of the join we had a big dataset of…

by Akshay Tambe

At Integral Ad Science (IAS), we measure over 100 billion data events daily, giving our customers unmatched scale, coverage, and accuracy. We process this data with hundreds of big data processing and data science pipelines. As we’ve continued to scale globally, IAS migrated to a cloud-based infrastructure hosted on Amazon Web Services (AWS), resulting in cost savings and increased performance. One great strategy to control and reduce AWS costs is to leverage spot instances.

Spot Instances are spare EC2 instances in the AWS Cloud that are offered at up to 90% cost savings compared to on-demand instances…

Is banana bread a bread? An unexpected UXR journey in cookbook design.

by Joey Stempel

After organizers announced they were putting together a company cookbook, I was quick to volunteer. As a UX Designer who loves to cook and consumes a significant amount of food media, I found this too perfect an opportunity to pass up. What I didn’t realize was just how helpful UX Research would be; by the end, I had conducted competitive research, run a card sorting exercise, and applied user-centered design principles.

In early discussions with the team, the first question to arise was: how should the…

by Yuva Mahendran

At Integral Ad Science we constantly experiment with technologies to process massive datasets and get insightful performance details for customers. One of our major initiatives over the upcoming quarters is to introduce streaming in our multi-billion-events-per-hour data ingestion layer and provide real-time metrics for our customers. Introducing streaming into this massive pipeline could easily span multiple quarters before reaping any benefit, if not properly planned. This blog covers our phased plan to introduce streaming in our system and highlights tracers we added to automatically test data consistency in the streaming pipeline.

Batch processing pipeline

The current log processing pipeline is…

by Yuva Mahendran

At Integral Ad Science, with billions of events hourly, milliseconds can make a difference in downstream processing. Is Apache Pulsar ready to replace Kafka as our go-to streaming data provider? We put it to the test.

Our main goal was to expose data and make it available for downstream processing within milliseconds of the actual event happening.

Candidates for experimenting

Apache Kafka is a framework that’s been in the market since 2011, and has stood the test of time inside and outside IAS. Given that we have our core-pipes already running in AWS, MSK (Amazon managed Kafka) was a natural…

by Janus Chung

I love technology and I am lucky enough to get paid for pursuing this hobby. So it is no surprise that in my spare time at home I am experimenting with new technologies and leveraging those to make my life at home a bit more convenient. One of the latest projects I worked on at home was rebuilding my home server.

In this new socially distant reality, the tools I use at work (Docker and automation with Git and Jenkins) have helped me build a home server to unify and simplify entertainment, and connect with family and…

IAS Tech Blog
