Swizec Teller - a geek with a hat (swizec.com)

Livecoding #31: Wherein we learn that datasets are hard and find 2 good papers

This is a Livecoding Recap – an almost-weekly post about interesting things discovered while livecoding. Always under 500 words and with pictures. You can follow my channel, here. New content almost every Sunday at 2pm PDT. There’s live chat, come say hai

There were things happening this weekend that left me glued to Twitter. Distracted and ineffective, I didn't get much done. It shows in the Livecoding session too.

I'm still deciding whether I want to write about the things that were happening. Maybe I should, maybe I shouldn't, maybe I have nothing useful to add. Who knows… ¯\_(ツ)_/¯

Completely coincidentally, I wanted to use the Livecoding session to build an immigration dataviz. Something that would show the positive economic impact of immigration, and not just This Is How Many People Came. Numbers are more interesting when coupled with impact.

Did you know that 24,000,000 people immigrated to the US in 2015 alone? That's a bunch of people.

Immigration chord diagram

Curran Kelleher built a great chord diagram of the UN migrations dataset. And there's this cool chord diagram of flights in and out of United States that @espinielli built.

Flights chord diagram
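For anyone who hasn't built one: a chord diagram is drawn from a square matrix where the cell at row i, column j holds the flow from region i to region j. Here's a minimal sketch of turning a flat list of origin/destination rows into that matrix. The `Flow` shape, the `toChordMatrix` helper, and the numbers are all hypothetical, made up for illustration. They are not taken from the UN dataset or from either diagram above.

```typescript
// Hypothetical record shape: one row per origin/destination pair.
interface Flow {
  from: string;
  to: string;
  people: number;
}

// Made-up numbers for illustration only. Not real migration data.
const flows: Flow[] = [
  { from: "A", to: "B", people: 120 },
  { from: "B", to: "A", people: 30 },
  { from: "A", to: "C", people: 50 },
  { from: "C", to: "B", people: 10 },
];

// Build a square matrix where matrix[i][j] is the flow from names[i] to names[j].
function toChordMatrix(rows: Flow[]): { names: string[]; matrix: number[][] } {
  // Collect every region that shows up as an origin or a destination.
  const seen = new Set<string>();
  for (const r of rows) {
    seen.add(r.from);
    seen.add(r.to);
  }
  const names = Array.from(seen).sort();

  // Map each region name to its row/column position.
  const index = new Map<string, number>();
  names.forEach((name, i) => index.set(name, i));

  // Start with all zeros, then add up the flows.
  const matrix = names.map(() => names.map(() => 0));
  for (const r of rows) {
    const i = index.get(r.from);
    const j = index.get(r.to);
    if (i === undefined || j === undefined) continue;
    matrix[i][j] += r.people;
  }
  return { names, matrix };
}

const { names, matrix } = toChordMatrix(flows);
console.log(names);
console.log(matrix);
```

A matrix like this is the kind of input a chord layout such as D3's `d3.chord()` works from. Everything after that, the arcs, ribbons, and labels, is rendering.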

I wanted something more. I wanted to build something that shows how many businesses are created by immigrants, how much money is pumped into the economy, and how many people were employed.

I failed. For now.

Those datasets are hard to find. I was able to find a dataset that shows the number of people self-employed, having jobs, or running a business based on race and ethnicity. It's called the Survey of Business Owners. It’s collected by the US Census Bureau and, as far as I can tell, released every five years rather than monthly.

Then there's the US Census Bureau's Current Population Survey, which also promised to be useful.

But I was unable to put them together and build a comprehensive dataset that shows what I wanted. Or even mentions it.

Looks like the US government is much more concerned with tracking whether somebody is Black, Asian, Hispanic, White, or a veteran than it is whether they're an immigrant or not. I wonder why…

That said, I found two amazing studies talking about what I wanted to show.

This Immigrant Entrepreneurship 2016 paper from Harvard is the first. It's 68 pages, so I haven't read it yet.

We examine immigrant entrepreneurship and the survival and growth of immigrant-founded businesses over time relative to native-founded companies. Our work quantifies immigrant contributions to new firm creation in a wide variety of fields and using multiple definitions. While significant research effort has gone into understanding the economic impact of immigration into the United States, comprehensive data for quantifying immigrant entrepreneurship are difficult to assemble. We combine several restricted-access U.S. Census Bureau data sets to create a unique longitudinal data platform that covers 1992-2008 and many states. We describe differences in the types of businesses initially formed by immigrants and their medium-term growth patterns. We also consider the relationship of these outcomes to the immigrant's age at arrival to the United States.

Sounds perfect, doesn't it?

Except for the "combine several restricted-access" part. I can't do restricted access.

This Immigrant Entrepreneurs and Small Business Owners, and their Access to Financial Capital 2012 paper from SBA tells a similar tale. Their datasets are also constructed out of restricted-access materials.

Alas, that puts a stop to this project for now. But I'll email the paper authors to see if they're willing to share.

Published on January 30th, 2017 in Livecoding, Technical
