Cocky, overconfident and stupid

Photo by Harrison Kugler on Unsplash

In 2017, I invested everything I had into Crypto. I didn’t have much. But I was all in. As you can probably guess from the title, things didn’t go so well.

For the first couple of months I was ecstatic. Prices kept going up. I was making more money in 24 hours than I had ever made in a regular job.

I began watching crypto YouTubers “explain” how various coins worked. I bought into the hype completely.

I told my friends to buy Crypto. “You don’t want to tell your grandkids you missed out, do you?”

I’d go onto coinmarketcap…


Ethereum One Way Hashing functions explained

Image by author. Quote from here.

Can you represent 8,018,009 as a product of two prime numbers?

You can use whatever calculator or program you like. I’ll wait.

****

I couldn’t do it either, at least not by hand. A computer can brute-force a number this small, but not one with hundreds of digits.

This is the guiding principle behind Crypto Maths.
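To make the asymmetry concrete, here is a minimal sketch of trial division, the most naive way to factor a number. The function name `factor_semiprime` is made up for illustration. Multiplying two primes takes one step; undoing it means searching. For a seven-digit number the search finishes instantly, but the work grows so quickly with the size of the number that it becomes infeasible for numbers hundreds of digits long.

```python
def factor_semiprime(n):
    """Return (p, q) with p * q == n, where p is the smallest factor above 1.

    Uses plain trial division. Returns None if n has no non-trivial factor.
    """
    if n % 2 == 0 and n > 2:
        return 2, n // 2
    d = 3
    while d * d <= n:          # only need to search up to sqrt(n)
        if n % d == 0:
            return d, n // d
        d += 2                 # skip even candidates
    return None


p, q = factor_semiprime(8_018_009)
print(p, q)        # 2003 4003
print(p * q)       # 8018009 (the easy direction: one multiplication)
```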

In this post we’ll go from a private key to an address using all the mathematical functions in between. Much of this comes from Chapter 4 of the ethereumbook.

Public Keys

You’ve heard many definitions of a Public Key. But here’s the real one:

“An Ethereum public key is a point on an elliptic curve, meaning it is a set of x…


Applying Neural Networks to the Meal Kit Industry

Photo by Lily Banse on Unsplash

So this is going to overfit.

Time series problems usually struggle with overfitting. This entire exercise became more of a challenge to see how I could prevent overfitting in time series forecasting.

I added weight decay and dropout, which should help prevent overfitting. The network has embedding layers for categorical variables (which I vary in size) followed by dropout and batch normalisation (for continuous variables).

According to this article, you ideally want lower amounts of dropout and larger amounts of weight decay.

Dataset

The data is given by a meal kit company. …


A walkthrough of Data Transformations in PySpark

Image by Markus Spiske from Pexels

Data is now growing faster than processing speeds. One of the many solutions to this problem is to parallelise our computing on large clusters. Enter PySpark.

However, PySpark requires you to think about data differently.

Instead of looking at a dataset row-wise, PySpark encourages you to look at it column-wise. This was a difficult transition for me at first. I’ll tell you the main tricks I learned so you don’t have to waste your time searching for the answers.

Dataset

I’ll be using the Hazardous Air Pollutants dataset from Kaggle.

This dataset has 8,097,069 rows.

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
df = spark.read.csv('epa_hap_daily_summary.csv', inferSchema=True, header=True)
df.show()


8 Data Science Algorithms Explained Visually

Interviewer: “So how does Random Forests work?”

Me: “Umm…well… It’s kind of like a decision tree…and…um”

Interviewer: “How does Gradient Descent work?”

Me: “So…if you look at the equation…um…it’s kind of like…umm”

This was me during a real data science interview. As you can probably imagine, I didn’t get the job.

How could I fail so badly? I knew the maths. I understood the material. I could code this up in Python.

The problem was: I couldn’t communicate my understanding.

My mathematics courses and programming courses taught me how to code and how to think about data modelling. …


Photo by Jess Bailey on Unsplash

Simple Hacks to reduce the Gas you pay for Smart Contracts

I went from paying excessive amounts in gas to paying a reasonable amount after doing a course on Solidity development.

I’ll tell you the main tricks here so you don’t waste your time doing the entire course.

1. Smaller uints

If you’ve got multiple uints inside a struct, use smaller-sized uints where you can. Solidity packs adjacent small fields into a single 32-byte storage slot, so the struct uses less storage.

Convert this:

struct NormalStruct {
    uint a;
    uint b;
    uint c;
}

To this:

struct MiniMe {
    uint32 a;
    uint32 b;
    uint c;
}

`MiniMe` will cost less gas than `NormalStruct` because of struct packing.
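To see why packing helps, here is a rough Python sketch that counts how many 32-byte storage slots a struct occupies. It is a simplified model, not the compiler's actual layout code: it assumes only fixed-size value types laid out in declaration order, and the helper name `count_slots` is made up for illustration.

```python
SLOT_BYTES = 32  # an EVM storage slot is 32 bytes (256 bits)


def count_slots(field_sizes):
    """Count storage slots used by fields laid out in declaration order.

    field_sizes: byte width of each field (uint / uint256 -> 32, uint32 -> 4).
    A field starts a new slot when it does not fit in the space left
    in the current slot.
    """
    slots = 0
    free = 0  # bytes still free in the current slot
    for size in field_sizes:
        if size > free:
            slots += 1
            free = SLOT_BYTES
        free -= size
    return slots


normal_struct = [32, 32, 32]  # uint a; uint b; uint c;
mini_me = [4, 4, 32]          # uint32 a; uint32 b; uint c;

print(count_slots(normal_struct))  # 3
print(count_slots(mini_me))        # 2
print(count_slots([4, 32, 4]))     # 3: the small fields are not adjacent
```

Under this model the two uint32 fields share one slot, so MiniMe needs two slots instead of three. The last line suggests why field order matters: small fields only pack together when they sit next to each other.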

2. View Functions Don’t Cost You a Thing

View functions don’t cost any gas when they’re called…


Paired Dataset for Image to Image Translation in Old Films

TLDR

I’ve created a dataset for training film restoration models.

The video above shows a sample. On the left is a video of a great Star Wars scene. On the right is the same video made crappier.

The extracted frames are available here: https://www.kaggle.com/spiyer/old-film-restoration-dataset/. You could use this to train a film restoration model (like I’ve been doing). Enjoy!

Why did I do this?

Properly cleaned data is not as abundant as people make it out to be.

I’ve been trying to restore the Star Wars deleted scenes for some time now. My attempts have been far from perfect.

Recently I thought that if I…


Use Icevision and Detectron2 to detect swimming pools from aerial imagery

Photo by CHUTTERSNAP on Unsplash

Talk is cheap. Show me the code. — Linus Torvalds

There’s a lot of talk about swimming pool detection from aerial imagery.

You’re probably interested in a code first example. I was too. But I couldn’t find one.

I decided to make my own.

It’s not perfect. It’s not pretty. But it seems to work.

All code is on GitHub. Criticism is appreciated.

Dataset

To make this you’ll need data. Lots of labelled training data. This can be tricky to obtain, particularly when your budget is as low as mine ($0).

I managed to find a government resource that gives you…


Easily Segment Canopy Cover and Soil using NDVI and Rasterio

Image by Author

In this post we’ll be trying to segment canopy cover and soil in satellite imagery.

We’ll be borrowing ideas from this paper. I’ll also be using ideas from my previous blog post on this topic.
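As a rough idea of what NDVI-based segmentation looks like, here is a minimal sketch in plain Python. NDVI is defined as (NIR - Red) / (NIR + Red). The pixel values and the 0.3 threshold below are made up for illustration and are not taken from the paper; with rasterio you would read the bands as NumPy arrays and apply the same expression elementwise.

```python
def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red); 0.0 if both bands are zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def segment(nir_band, red_band, threshold=0.3):
    """Label each pixel 'canopy' if its NDVI exceeds the threshold, else 'soil'.

    The threshold is an assumed value; it needs tuning per image.
    """
    return [
        ["canopy" if ndvi(n, r) > threshold else "soil"
         for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2x2 reflectance values: top row vegetation, bottom row bare soil.
nir_band = [[0.60, 0.55], [0.30, 0.25]]
red_band = [[0.10, 0.12], [0.25, 0.22]]

labels = segment(nir_band, red_band)
print(labels)  # [['canopy', 'canopy'], ['soil', 'soil']]
```

Healthy vegetation reflects strongly in near-infrared and absorbs red light, so its NDVI sits well above that of bare soil.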

Ideally we want to go from a regular satellite image:


Hands-on Tutorials

Use KMeans clustering to segment satellite imagery by land cover/land use

Image by Author

Recently, I applied KMeans clustering to Satellite Imagery and was impressed by the results. I’ll tell you the tricks I learned so you don’t waste your time.

Things to note:

  • Use rasterio, not gdal. Rasterio is more Pythonic.
  • For this example I’ll be using Terravion imagery. This gives high resolution low level satellite imagery. The Terravion imagery comes in 8 different bands.
  • I’ll have 3 clusters: canopy cover (trees, vegetation, etc.), soil and background.

KMeans Explanation

I made an infographic to explain KMeans in simple English. Check it out on Reddit or Twitter.
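For readers who prefer code to pictures, here is a minimal sketch of the KMeans loop in plain Python. The pixel values and starting centroids are made up for illustration; a real pipeline would more likely use a library implementation such as scikit-learn's KMeans, which also handles initialisation and convergence checks.

```python
def kmeans(points, centroids, iterations=10):
    """Plain KMeans for a fixed number of iterations.

    Each round assigns every point to its nearest centroid (squared
    Euclidean distance), then moves each centroid to the mean of the
    points assigned to it.
    """
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its old centroid
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical (red, near-infrared) values for six pixels
pixels = [(0.10, 0.60), (0.12, 0.55),   # canopy-like
          (0.30, 0.35), (0.28, 0.33),   # soil-like
          (0.02, 0.03), (0.01, 0.02)]   # background-like

# Assumed starting centroids: one pixel from each group
labels, centroids = kmeans(pixels, [pixels[0], pixels[2], pixels[4]])
print(labels)  # [0, 0, 1, 1, 2, 2]
```

With three clusters, each pixel ends up with a label 0, 1 or 2, which is what gets mapped back to canopy cover, soil and background.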

Stack Bands

Each Terravion image has the…

Neel Iyer

Data Scientist at Swiss Reinsurance. LinkedIn: https://www.linkedin.com/in/neel-iyer/
