
GOTO 2016 • Fixing the Image Problems of the Web using Machine Learning • Chris Heilmann


00:00:08hello so as said if you

00:00:11looked for the other talk bad luck it’s

00:00:13me I’m also gonna be here later on

00:00:16talking about progressive Web Apps but

00:00:18I’d filled in and the organizers were

00:00:21nice enough to give me alcohol and use

00:00:22words and things and it was very

00:00:24effective so I started saying okay I can

00:00:26give another talk I’m actually going to

00:00:28a pixel camp after that in BA in

00:00:30Portugal and give something similar

00:00:32there so it’s good thing to try it out

00:00:33on a real audience so I want to talk

00:00:36today about the image problem of the web

00:00:37and not the image of the web but images

00:00:40on the web and how we can solve these

00:00:42issues with machine learning and also

00:00:45with just tooling that we think about

00:00:47because there’s far too many things on the web

00:00:49that we’re not optimizing right now and

00:00:51the problem is that we’re sitting on

00:00:53these wonderful machines with fat

00:00:55connections and we put things on the web

00:00:57and we think it’s beautiful and our end

00:00:59users basically stand there on a mobile

00:01:00phone with a spinning thing and see

00:01:02their data traffic in the

00:01:04background going there and a bank

00:01:05account going down and they wonder

00:01:07what’s going on so I’m Chris Heilmann

00:01:10@codepo8 on Twitter in case you need

00:01:12more pictures of kittens and hedgehogs

00:01:13and some JavaScript stuff as well I’m

00:01:16pretty active there this is me

00:01:18looking at mobile phones and being

00:01:19annoyed about the state of the web on

00:01:21mobile phones a few years ago and

00:01:22luckily enough it got much better I

00:01:24work for Microsoft on open source things

00:01:27like the JavaScript engine Visual Studio

00:01:30Code and also a little bit in machine

00:01:32learning but I’m more of a fanboy of the

00:01:34machine learning team and help them to

00:01:36talk to humans and humans to machine

00:01:38learning people it’s a bit like

00:01:40translating for programmers to other

00:01:42people which is another skill that you

00:01:43should have let’s go back in time a bit

00:01:47we all know this character right if you

00:01:50know him in green then you probably have

00:01:51an older sibling because you were never

00:01:54allowed to play Mario you always have to

00:01:55play Luigi and what is funny about this

00:01:58character and I used to do games on

00:02:00Commodore 64 and gameboy color and an

00:02:03Amiga later on is we always think like

00:02:05this this limitation is because this is

00:02:08how it is but there is sense

00:02:11and meaning in all of the things that is

00:02:12about Mario first of all the red and

00:02:16blue offered the best contrast to the

00:02:19skin

00:02:19and the game background so they

00:02:22didn’t put them together because they

00:02:23were the colors that were left over they

00:02:25put them in there so you can see the

00:02:26character all the time nowadays when

00:02:29people go for nostalgia of pixels they

00:02:31don’t understand that back then you

00:02:33never saw a pixel we jumped over

00:02:34backwards to make sure you didn’t see

00:02:36pixels because it was bad TV sets they

00:02:38smooshed everything together so you got

00:02:40to make sure that you have contrast

00:02:42between the different colors to make

00:02:44sure that your game character is visible

00:02:46the cap meant there was no need to worry

00:02:48about hair style eyebrows or a forehead

00:02:50actually originally Mario was not

00:02:53supposed to be a plumber they

00:02:54just needed to make him a plumber

00:02:56because they didn’t have enough pixels

00:02:57for the hair when he fell down a hole so

00:02:59they just gave him a cap and then like

00:03:01okay who has caps plumbers okay cool we

00:03:03got these pipe sprites as well so we

00:03:05might as well use those and the large

00:03:08nose and the mustache made it possible

00:03:10to avoid a mouth and facial expressions

00:03:11because you just didn’t have enough

00:03:13pixels for different facial expressions

00:03:15and this is what Mario was built for

00:03:17and what he was designed for and it was

00:03:20great because it was designed by

00:03:21limitations I always loved being

00:03:23creative in environments that are

00:03:25limited and this is over and back then

00:03:28we fought for every pixel fought for

00:03:30every piece of information for every

00:03:32byte but this is over because nowadays

00:03:34we got these massive machines fast

00:03:36connections we’ve got quad-core

00:03:38computers in our in our pockets and we

00:03:40just don’t understand that people on the

00:03:42other side of the planet might not be

00:03:43able to see that so everything has

00:03:46reasons and meaning in that design that

00:03:48we did back then whereas nowadays

00:03:50evolution is happening around us we’re

00:03:52moving away from desktop machines to

00:03:54laptops and now actually to mobile

00:03:56phones the next million users of the web

00:03:58will not be on any desktop or laptop they

00:04:01will be on mobile devices and the reason

00:04:03is infrastructure in countries where

00:04:05there is growth on the internet like

00:04:07Africa Indonesia Bangladesh India people

00:04:12don’t have any flats where a computer

00:04:14could be set up people can’t afford a

00:04:16MacBook Pro people can afford a mobile

00:04:18phone and a data connection there’s not

00:04:20even cables in the ground to get

00:04:21connectivity but there are mobile masts

00:04:24everywhere so everybody will be on

00:04:25mobile devices so that’s what we have to

00:04:27think about for the near future or

00:04:29actually right now this is where the

00:04:30next growth will be and the next growth

00:04:32after that

00:04:33will not even have a UX anymore which will

00:04:35just be chatbots and systems that people

00:04:37can talk to so technology advanced and

00:04:40pixels are a side product of our

00:04:42interactions with the web

00:04:43most people don’t draw things on the web

00:04:46or make graphics or they just take

00:04:48pictures and upload them and they don’t

00:04:49even caption them they don’t even

00:04:51explain what the what the picture is

00:04:54they just let the pictures speak for

00:04:55itself which of course is incredibly

00:04:57depressing if you’re a blind person and

00:04:59you get these pictures without any

00:05:01alternative text you don’t know what’s

00:05:02going on or you’re an old person like

00:05:04me and you try to understand what

00:05:06snapchat might be about you just like I

00:05:07have no plan anymore what’s going on

00:05:09here these are people sending selfies to

00:05:11each other for the last two hours what’s

00:05:12the meaning of this but okay I’m old

00:05:15fuck it

00:05:16the problem is that we take pictures and

00:05:19we upload pictures and unoptimized

00:05:21pictures too the bigger the better

00:05:23I mean some of the phones the phone that

00:05:25I have has like a 20 megapixel camera in

00:05:27it this is like an 8 meg photo that I

00:05:29just upload in the background cause my

00:05:30data plan in England it’s good enough to

00:05:32do it I don’t care about it and if you

00:05:34look at the average website and that is

00:05:36actually rather old this is probably

00:05:38bigger right now we can take a look at

00:05:39that later if you want to it’s 2.2 meg

00:05:42for a website and this is for one web

00:05:45page this is not the site that’s not the

00:05:47whole site it’s the first loading thing

00:05:48that people see on the screen and it’s

00:05:512.2 megabyte to say like hello and

00:05:53welcome to our website

00:05:54and this is the state that we’re in

00:05:56right now because we kept pushing things

00:05:58on the web you need these 12 libraries

00:06:00you need these 15 JavaScript frameworks

00:06:02and you upload images because they’re

00:06:04pretty and on iOS iPads in retina they

00:06:07need to really look great and

00:06:09everybody else should get the same

00:06:10picture so it’s 1.4 Meg of images in

00:06:14that on the average web

00:06:16page out there I call this inspirational

00:06:18obesity we just put things in there

00:06:21because we see them pretty on our

00:06:22high-end devices but our end users don’t

00:06:24necessarily get them they’re just

00:06:25standing there and getting the loading

00:06:27spinner for five minutes which is not a

00:06:29good experience 1.4 megabyte of images

00:06:32mostly because of wrong file formats

00:06:33people save images as PNGs with alpha

00:06:36channel that don’t need any alpha

00:06:37channel and would happily be a JPEG or

00:06:39WebP if you have a browser that

00:06:41supports WebP or you have like text

00:06:44saved as JPEGs and it’s unreadable you

00:06:46know

00:06:47with all the artifacts on it it just

00:06:49drives me crazy that we just don’t

00:06:50understand which format to use for which

00:06:52image because most of time people who

00:06:54upload images are maintenance coders

00:06:56they’re not developers they’re not

00:06:57designers they’re people that just use a

00:06:59content management system drag and drop

00:07:01it in you know when you’re a

00:07:02freelancer and you ask a client for a logo

00:07:04and you get a Word document with an embedded

00:07:06logo in it and you just want to

00:07:08go back to goat farming and just I don’t

00:07:10want to live anymore

00:07:12we’re delivering high scaled high-res

00:07:15images to everybody we take a six

00:07:17thousand pixel image and make it torn

00:07:18and pixel wide I’ve seen that so many

00:07:20times it’s quite fun as well and people

00:07:21wonder when browsers are slow no

00:07:24automatic conversion optimization steps

00:07:26we have all that technology we just

00:07:28don’t use it we just have an upload

00:07:29facility and even WordPress tells you

00:07:32can only upload two megabytes so what

00:07:34does the good WordPress admin do turn

00:07:36that off and say like you can upload

00:07:37whatever you want and then people have

00:07:3920 megabyte images in their hero image

00:07:42instead of text content that’s very

00:07:43important web design lately it’s just

00:07:45this massive thing and I blame medium

00:07:48you know like instead of writing a long

00:07:50article you got this massive image

00:07:51before you actually scroll you like what

00:07:53did you want to tell me this image is

00:07:55not you is it we need to change that to

00:07:58make the web fast again because a

00:08:00connectivity is our biggest new hurdle

00:08:02it’s like for us here on Wireless this

00:08:05is actually amazingly good wireless for

00:08:06a conference but most of the time you’re

00:08:08in you’re under somewhere and your

00:08:10connectivity might be good but the next

00:08:12second it might be gone it might be a

00:08:14connectivity to a Wi-Fi connection that

00:08:16shows you oh I’m Wi-Fi and then you

00:08:18can’t connect it unless you give it your

00:08:20credit card details your firstborn some

00:08:22blood and like your your home address or

00:08:24something like that or sometimes you

00:08:26just can’t even connect although

00:08:27it gives you the full bars it’s called

00:08:29Lie-Fi the web is much bigger than our

00:08:32little developer world and growth

00:08:34happens outside of it if you want to

00:08:36think about the next few years of the

00:08:37web and you want to keep your job

00:08:39think about those markets that you don’t

00:08:41think about right now because this is

00:08:42where growth happens everywhere else is

00:08:44on the decline people don’t actually

00:08:46download new apps people don’t use the

00:08:48web as much as they used to do the big

00:08:50winners are people that stay inside

00:08:52Google services inside Facebook services

00:08:54inside being no and inside Facebook

00:08:56services Google services and of course

00:08:58in chat systems like what

00:09:01one of these solutions that Google and

00:09:04Opera for example are really good at are

00:09:06cloud services and proxy browsers so

00:09:09those are actually used for example a lot in

00:09:12Africa and India automatically strip

00:09:14down your images and automatically

00:09:16convert them to something really pixely

00:09:18and ugly because it’s better if somebody

00:09:20gets an ugly picture than no picture at

00:09:22all if you relied on a picture for

00:09:24your content and they also strip down

00:09:26your CSS and your JavaScript if your

00:09:28javascript takes longer than 1.2 seconds

00:09:30to run on an old android device then it

00:09:33actually takes out your javascript so if

00:09:35you relied on your javascript for your

00:09:36page to load you’re not lucky there

00:09:38people will not see anything but

00:09:41there’s a few things we

00:09:43can do instead of relying on these proxy

00:09:45browsers and hoping that Google fixes

00:09:47everything for us so the problems with

00:09:49images are huge images for everybody

00:09:51unoptimized images no alternative content

00:09:54no training or incentive to add content

00:09:56in content management systems and here’s

00:09:59our arsenal to fix that I’m doing a bit

00:10:01faster because he said I have to do Q&A

00:10:02and stuff so but you’re clever so it’s

00:10:05all good better browsers with responsive

00:10:08image support are here right now and we

00:10:10don’t have to worry about these older

00:10:12browsers anymore automated lossless

00:10:14image optimization tools file level

00:10:16access to images to extract metadata

00:10:19scripting solutions to offer alternative

00:10:21content and cloud services machine

00:10:23learning API for intelligent resizing

00:10:25and I’m going to go through them bit by

00:10:27bit machine learning for tagging as well

00:10:29so browsers with responsive image

00:10:31support responsive design it should not

00:10:34be an unknown to people anymore it’s

00:10:36just a sensible thing to do because I

00:10:38have this and I look at it like that I

00:10:40turn it like that then I switch to this

00:10:41one and I switch to my Xbox to my fridge

00:10:43to my dog to my cat wherever the

00:10:45internet runs on nowadays there is no

00:10:48screen size any longer there’s no thing

00:10:50oh we need 1024 pixels it’s like

00:10:52water you put it anywhere and it will

00:10:54fill as much as it can and it’s fine so

00:10:57media queries were the first idea that

00:10:59we had with that in CSS and also in

00:11:01JavaScript with matchMedia

00:11:03the problem with media queries

00:11:04is that if you actually have a CSS

00:11:07file with your large images in it your

00:11:08mid images and your small images the

00:11:10browser loads all of them and only shows

00:11:12the one that is appropriate but

00:11:14data in the background still gets used

00:11:15and if you’re on a metered data plan

00:11:17that’s a bad idea

00:11:18that’s why we invented the picture

00:11:21element and srcset srcset came

00:11:23from Apple picture element came from

00:11:24people who just looked at the video

00:11:26element and said like why don’t we have

00:11:27a picture element because in this one

00:11:29you define the image in several several

00:11:32formats and several sizes and the

00:11:34browser only loads the one that actually

00:11:36applies and doesn’t touch the other ones

00:11:38so that way you have no problem of all

00:11:40the images being loaded the support is

00:11:42great this is again outdated it’s

00:11:44updated now I think Safari is now in

00:11:47the newest version caniuse.com is

00:11:49always your friend if you want to try

00:11:50something new out type it in there be

00:11:53happy and start using it don’t

00:11:55complain that old browsers might die

00:11:56because they have to die great

00:11:58information on Jake Archibald’s blog the

00:12:00anatomy of responsive images where he

00:12:02explains what all these srcset

00:12:04shortcuts mean and what all the

00:12:05information is about but in essence most

00:12:08of the systems out there already use that

00:12:10WordPress now uses it out of the box if

00:12:12you just put an image in there it does

00:12:13the picture element for you there’s also

00:12:15a great live demo on didi on our windows

00:12:18developer site and that one shows you in

00:12:21a real world scenario what that looks

00:12:22like so that painting has been painted

00:12:24by one of our one of our colleagues by

00:12:26his wife and shows you like then it only

00:12:28loads the image that is necessary for

00:12:30that size in the right format instead of

00:12:33downloading lots and lots of data in the

00:12:35background and then resizing it

00:12:36accordingly and you can you can play

00:12:38with that quite nicely to do or have a

00:12:40proper text to image ratio as well now

00:12:45you have automated tools for lossless

00:12:46image optimization that’s very important

00:12:49with lossy image optimization you make

00:12:51your designers really unhappy because

00:12:52you put like artifacts in there or get

00:12:55it down to like 12 colors instead of

00:12:56like 256 it’s not fun to do that

00:12:59lossless optimization a lot of times

00:13:00it’s a packing algorithm that doesn’t

00:13:03change the look and feel but just goes on

00:13:04the byte level of the image and strips

00:13:06out the bytes that are not necessary and

00:13:08not needed because Photoshop and

00:13:11other image editors put a lot of data

00:13:13into the file itself that you don’t need

00:13:15ImageOptim is the big one there if you

00:13:18don’t use that yet please use it it’s

00:13:19also available as an npm module you can

00:13:21also put it in your node solutions and

00:13:23that one allows you just to drag images

00:13:25into it and it automatically optimizes

00:13:27the images according to what it is so a

00:13:30gif gets optimized with one optimizer

00:13:32JPEG with another a BMP if you use it on

00:13:35the web I come and hurt you and when

00:13:37other images TIFF for example get also

00:13:39downsampled to a format that makes much

00:13:41more sense and it’s as easy as that just

00:13:43drag it in there and it’s

00:13:45replacing the original image it’s not

00:13:47making a new image so you don’t have

00:13:49to copy over or something it just

00:13:51changes all the things in there that are

00:13:52not needed and in this case for example

00:13:54we got a 44% win on a JPEG and this is

00:13:59as simple as it is before you put your

00:14:01images on the web run it through a

00:14:02system like that and everybody wins

00:14:05now we have file level access to

00:14:08information in images we always had

00:14:10that in things like ImageMagick or the GD

00:14:13library in PHP but now we have it in

00:14:15JavaScript as well we can use the EXIF

00:14:17data in the image itself when you

00:14:19right-click something in Windows and

00:14:20it shows you the EXIF data where it’s done

00:14:22you can access that in JavaScript and do

00:14:24cool stuff with that as well for example

00:14:27instead of rotating a JPEG in the

00:14:29browser you can actually read the header

00:14:31and then it tells you what the rotation

00:14:33of the image is so you already

00:14:35know that it’s gonna be displayed the

00:14:37right way before you turn it out and

00:14:40in 2012 when I used to work for Yahoo we

00:14:43already played with that we already worked

00:14:44with that in Flickr and it was just

00:14:47amazing that we haven’t thought of this

00:14:49in between because what Flickr for

00:14:51example did when I uploaded images it’s

00:14:54a pretty cool thing you dragged it up

00:14:55there and you show it in a browser and

00:14:58the photos immediately show up these are

00:15:00like all 8 megabyte pictures and it was

00:15:02not a fast connection so the photos show

00:15:04up quickly and then they start

00:15:07uploading in the background so if we

00:15:08take a look at this zoomed in you can

00:15:11see that the image shows and then you

00:15:12got the little circle thing there

00:15:14uploading the image in the background

00:15:16and this is using the EXIF data in the

00:15:18JPEG itself every JPEG has a little

00:15:21thumbnail in its file so

00:15:25you can read the first 50 bytes and then

00:15:27display that as a thumbnail and then

00:15:29load the rest of it so instead of

00:15:31loading it and then have an onload

00:15:33handler you load it with a FileReader as

00:15:35a stream and display it while you’re

00:15:37actually showing it and that is a great

00:15:38way to give an interface to your end

00:15:40user

00:15:41that looks much more interactive than

00:15:43just please wait
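
The header-reading idea described above (looking at a JPEG's EXIF block before decoding the whole image) can be sketched roughly as follows. This is a minimal illustration, not the code Flickr used: the function names are made up for this example, it only handles the common case of an orientation tag in the first image directory, and it assumes the bytes have already been read, for example with `new Uint8Array(await file.arrayBuffer())` in a browser.

```typescript
// Sketch: read the EXIF orientation tag (0x0112) straight from JPEG bytes.
// Returns the stored value (normally 1..8), or null when none is found.
// All function names here are hypothetical, chosen for this example.
function readExifOrientation(bytes: Uint8Array): number | null {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.byteLength < 4 || view.getUint16(0) !== 0xffd8) return null; // no JPEG start marker

  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    const marker = view.getUint16(offset);
    // Give up when markers stop lining up or the compressed image data starts.
    if ((marker & 0xff00) !== 0xff00 || marker === 0xffda) return null;
    const length = view.getUint16(offset + 2); // includes the two length bytes, not the marker
    if (marker === 0xffe1 && hasExifHeader(view, offset + 4)) {
      return orientationFromTiff(view, offset + 10);
    }
    offset += 2 + length;
  }
  return null;
}

// An EXIF APP1 payload starts with the ASCII bytes "Exif" followed by two zero bytes.
function hasExifHeader(view: DataView, start: number): boolean {
  return (
    start + 6 <= view.byteLength &&
    view.getUint32(start) === 0x45786966 &&
    view.getUint16(start + 4) === 0
  );
}

// Walk the first image directory of the embedded TIFF structure.
function orientationFromTiff(view: DataView, tiff: number): number | null {
  if (tiff + 8 > view.byteLength) return null;
  const order = view.getUint16(tiff);
  if (order !== 0x4949 && order !== 0x4d4d) return null;
  const little = order === 0x4949; // "II" = little-endian, "MM" = big-endian
  const ifd = tiff + view.getUint32(tiff + 4, little);
  if (ifd + 2 > view.byteLength) return null;
  const entries = view.getUint16(ifd, little);
  for (let i = 0; i < entries; i++) {
    const entry = ifd + 2 + i * 12; // each directory entry is 12 bytes
    if (entry + 12 > view.byteLength) return null;
    if (view.getUint16(entry, little) === 0x0112) {
      return view.getUint16(entry + 8, little);
    }
  }
  return null;
}
```

Because only the first segments of the file are inspected, a check like this can run on a small slice of a large upload before anything is decoded or drawn.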

00:15:44of course there’s EXIF data in your

00:15:46pages as well if you don’t want to give

00:15:48it out I created removephotodata.com

00:15:51which works on a mobile phone works

00:15:52offline doesn’t have any server at all

00:15:54just works in JavaScript in your browser

00:15:56where you can drag an image in and

00:15:58it gets rid of all the EXIF data and

00:16:00gives you the image for downloading so

00:16:01in case you don’t want for example your

00:16:03geolocation in your image or you don’t

00:16:05want people to know which camera it

00:16:07was taken with it’s probably a good idea

00:16:09to do these kind of things
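
One way to remove that metadata entirely in the browser, sketched under assumptions: the function below drops every APP1 segment from a JPEG byte array, which is where EXIF (including geolocation and camera model) is stored. It is not necessarily how the site mentioned above works, the function name is invented for this example, and it also removes XMP data, which uses the same segment type.

```typescript
// Sketch: drop every APP1 segment (where EXIF lives) from a JPEG byte array.
// stripApp1Segments is a hypothetical name, not an existing library function.
function stripApp1Segments(bytes: Uint8Array): Uint8Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.byteLength < 4 || view.getUint16(0) !== 0xffd8) {
    throw new Error("not a JPEG file");
  }

  const kept: Uint8Array[] = [bytes.subarray(0, 2)]; // keep the start-of-image marker
  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    const marker = view.getUint16(offset);
    // Stop at start-of-scan (compressed pixels follow) or when markers stop lining up.
    if (marker === 0xffda || (marker & 0xff00) !== 0xff00) break;
    const end = offset + 2 + view.getUint16(offset + 2);
    if (marker !== 0xffe1) kept.push(bytes.subarray(offset, end));
    offset = end;
  }
  kept.push(bytes.subarray(offset)); // image data and trailer pass through untouched

  // Concatenate the kept pieces into one new array.
  const out = new Uint8Array(kept.reduce((n, part) => n + part.length, 0));
  let pos = 0;
  for (const part of kept) {
    out.set(part, pos);
    pos += part.length;
  }
  return out;
}
```

The pixel data is never decoded or re-encoded here, so the visible image stays byte-for-byte the same; only the metadata segments disappear.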

00:16:10the geolocation is also visible in most

00:16:13of the JPEGs that you do nowadays with

00:16:14cameras and it can tell you where the

00:16:17picture has been taken and that has been

00:16:19the downfall for a few people that wanted

00:16:21to harass other people with

00:16:23pictures of parts of their anatomy and

00:16:25then they actually found them because

00:16:27they realized where they lived which is

00:16:28good but you should actually make sure

00:16:30that if you don’t want to give that data

00:16:32out there be sure that it’s actually not

00:16:34in your JPEG file anymore so a good

00:16:38thing about an interface with images is

00:16:40to provide fallback content so instead

00:16:43of just waiting for the image to load

00:16:45you could for example give a colored

00:16:47background that is part of the image and

00:16:50then gets replaced when the image has

00:16:52been loaded a lot of a lot of systems

00:16:55use that nowadays already the blur up

00:16:58technique is a big one

00:16:59you can see this one for example here

00:17:01let me start that again where you

00:17:03see the image being blurry and then

00:17:05becoming sharp this is on

00:17:07Medium Medium uses that for example and

00:17:10on Medium this is the code for it and

00:17:12this is pretty much nuts you know it’s

00:17:16like a figure with tons of divs in

00:17:17there and a JavaScript and an image

00:17:19progressive media bla whatever I don’t

00:17:21know what’s going on there and which is

00:17:25it looks good but this is I don’t know

00:17:27why they do it that way because there is

00:17:28a CSS technique to do the same thing so

00:17:31what you do is you take this much

00:17:32smaller image of that one like the

00:17:34thumbnail that is embedded in the JPEG

00:17:35and you scale it up in CSS with a

00:17:38hundred percent of the width the auto

00:17:40width of the container and set a CSS

00:17:43blur filter on it or an SVG blur filter

00:17:45over it and then when the full image has

00:17:47been loaded you just turn off the filter

00:17:49and you get rid of the small thumbnail

00:17:51image and that way you get the same

00:17:52effect without having to jump through

00:17:54hoops of

00:17:5410,000 lines of JavaScript but it looks

00:17:57good it gives it gives the impression

00:17:59that something is happening and you

00:18:00cannot do anything worse

00:18:02than making people just wait

00:18:04people don’t like waiting and especially

00:18:05not on a mobile device so this is a

00:18:07great way of making that work you can

00:18:11also count pixels in canvas I have

00:18:14full access to every image in the

00:18:16browser nowadays I can’t have access to

00:18:18an image on another domain because

00:18:20there’s a security problem in there but

00:18:22if I drag and drop an image for example

00:18:23into the browser or I have the image

00:18:26already on the same domain

00:18:28I get pixel level access

00:18:30to the image itself so if

00:18:34I put it in the canvas and read out the

00:18:36canvas data the canvas data is an

00:18:38object with the width and the height and

00:18:40then it’s an array of four elements

00:18:43which is like the RGB and the alpha

00:18:45value of each of the pixels so for

00:18:47example in this case here I have this

00:18:49little C64 text thing and I just count

00:18:52the pixels and it tells me there’s ten thousand

00:18:54four hundred seventy two black ones so

00:18:55that’s probably the main

00:18:57color that I want to use here and you

00:18:59can use that too to determine which are

00:19:01the colors that are there but there are

00:19:03better ways of doing it but this is a

00:19:04nice way or simple way of doing it and

00:19:06this is the code so just note that down
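
The slide code is not part of this transcript, so the following is only a sketch of the counting approach described above, with made-up names. It assumes the flat RGBA byte array that a canvas 2D context returns from `getImageData(0, 0, width, height).data`: four consecutive values (red, green, blue, alpha) per pixel.

```typescript
// Sketch: find the most frequent colour in a flat RGBA pixel array.
// In a browser the input would be ctx.getImageData(0, 0, w, h).data.
// ColorCount and dominantColor are hypothetical names for this example.
interface ColorCount {
  color: string; // "r,g,b"
  count: number;
}

function dominantColor(rgba: Uint8ClampedArray): ColorCount | null {
  const counts = new Map<string, number>();
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    if (rgba[i + 3] === 0) continue; // skip fully transparent pixels
    const key = `${rgba[i]},${rgba[i + 1]},${rgba[i + 2]}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  let best: ColorCount | null = null;
  counts.forEach((count, color) => {
    if (best === null || count > best.count) {
      best = { color, count };
    }
  });
  return best;
}
```

Exact counting like this works well for flat artwork such as the C64 text example; for photographs, where almost every pixel differs slightly, colours would usually be grouped into buckets first.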

00:19:09quickly now the slides are available

00:19:11later on as well there’s lots of tools

00:19:13that use these kind of functionality as

00:19:15well there’s Colorify.js which

00:19:18uses the gradients as a background

00:19:20and find out the right color for you and

00:19:23it has a lazy reveal as well so you can

00:19:25load them and fade it in from the image

00:19:28to from the color to the image and so on

00:19:30and so forth and you got Color Thief

00:19:31which is really really cute as that one

00:19:34allows you to like for example click

00:19:36this is a demo here so it clicks on the

00:19:37image it finds the dominant color and it

00:19:39finds the palette of it as well so this

00:19:41is cool to basically have an image and

00:19:43show CSS stuff around it that it’s the

00:19:46right palette and the right kind of

00:19:48color according to that image and that’s

00:19:50again a JavaScript library you can use

00:19:52for that now let’s go to the

00:19:55nitty-gritty of like what we can do with

00:19:57computers nowadays about images and this

00:19:59is where I’m getting very excited what

00:20:01we can do which is for example

00:20:03intelligent image resizing so to have a

00:20:05thumbnail of that image would normally

00:20:08be like let’s take

00:20:09that massive image and make it 150

00:20:11pixels wide and you have like a few

00:20:12pixels on the left that might be a woman

00:20:14and lots of blue pixels on the right

00:20:16that we don’t need so instead what we do

00:20:19is we detect okay where this would be

00:20:22the normal way to cut out 150 by 150 in

00:20:24there it’s nice but it’s not good enough

00:20:26this one is much better because what we

00:20:28did is we detected the face of the lady

00:20:30and then we actually centered it on the

00:20:33thing and cut the rest out

00:20:34and this one is the best because we

00:20:37detected in the image the outline of

00:20:39that person and then actually cut only

00:20:42that one out so this is something that

00:20:44you do by hand in Photoshop or something

00:20:46but machines can do quite nicely

00:20:47nowadays as well and it makes much more

00:20:50sense to have something like that

00:20:51displayed in your website than something

00:20:53that is just a blurry mess and

00:20:56you don’t know what’s going on there and

00:20:57you don’t want to click on every

00:20:58thumbnail and please never ever resize

00:21:01an image to become a thumbnail the idea

00:21:03of a thumbnail is it’s a preview of the

00:21:05image both in file size and in size not

00:21:08only in size I see so many people

00:21:09downloading 550 megabyte pictures and

00:21:12show them as 100 by 100 and when

00:21:13you click on them look it’s really fast

00:21:14because it’s already loaded yeah 20

00:21:17megabyte are downloading and I

00:21:19only want to watch one of them there’s a

00:21:21JavaScript library called smartcrop.js

00:21:23that shows you how to do these

00:21:25things it’s kind of heavy on the machine

00:21:27so on a desktop machine fine on a mobile

00:21:30phone please don’t run this kind of

00:21:31stuff because it’s not meant for doing

00:21:33that and you don’t want to fry eggs on

00:21:35the back of your phone just to have a

00:21:36nice thumbnail so you see in

00:21:39this case

00:21:43it found the outlines of the man and

00:21:45then cropped around it and that way it

00:21:48found the right size so it

00:21:51determines what the outlines are and

00:21:53depending on how close they are to each

00:21:54color and to each other it realized this

00:21:56is the most important part of that image
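
The outline-based cropping described above can be approximated with a much simpler heuristic. The sketch below is not smartcrop.js's algorithm or API: it assumes a grayscale image as a flat array, scores each pixel by how strongly its neighbours differ (a crude edge measure), and picks the crop window containing the most edge energy. All names are hypothetical.

```typescript
// Sketch: choose the crop window with the most "edge energy".
// A simplified stand-in for content-aware cropping, not smartcrop.js itself.
interface Crop {
  x: number;
  y: number;
  width: number;
  height: number;
}

function bestCrop(
  gray: Uint8Array, // one brightness value per pixel, row by row
  width: number,
  height: number,
  cropW: number,
  cropH: number
): Crop {
  if (cropW > width || cropH > height) throw new Error("crop larger than image");

  // Brightness lookup that clamps coordinates at the image border.
  const at = (x: number, y: number): number =>
    gray[Math.min(height - 1, Math.max(0, y)) * width + Math.min(width - 1, Math.max(0, x))];

  // Summed-area table of edge energy, so any window sum costs four lookups.
  const stride = width + 1;
  const sums = new Float64Array(stride * (height + 1));
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const energy =
        Math.abs(at(x + 1, y) - at(x - 1, y)) + Math.abs(at(x, y + 1) - at(x, y - 1));
      sums[(y + 1) * stride + (x + 1)] =
        energy + sums[y * stride + (x + 1)] + sums[(y + 1) * stride + x] - sums[y * stride + x];
    }
  }

  // Slide the window over every position and keep the highest score.
  let best: Crop = { x: 0, y: 0, width: cropW, height: cropH };
  let bestScore = -1;
  for (let y = 0; y + cropH <= height; y++) {
    for (let x = 0; x + cropW <= width; x++) {
      const score =
        sums[(y + cropH) * stride + (x + cropW)] -
        sums[y * stride + (x + cropW)] -
        sums[(y + cropH) * stride + x] +
        sums[y * stride + x];
      if (score > bestScore) {
        bestScore = score;
        best = { x, y, width: cropW, height: cropH };
      }
    }
  }
  return best;
}
```

The summed-area table keeps the search cheap, but as the talk notes, this kind of analysis is better done once on a server or desktop than repeatedly on a phone.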

00:21:58there is a company called Cloudinary

00:22:01that are using our systems under the

00:22:03hood and a few others as well in Israel

00:22:05there they’re really really adamant

00:22:07right now to tell you about their stuff

00:22:09but they’re really lovely people I

00:22:10was in Israel a few days ago and I

00:22:12talked to them and what they do is to

00:22:14give you a URL API like a REST API

00:22:17so you can say res.cloudinary.com

00:22:19and then you have your image the

00:22:21uploaded image and then you

00:22:22say okay give me a 16 by 9

00:22:24ratio make it 640 pixels wide of the

00:22:28phone JPEG so this one now realizes

00:22:31okay it’s sixteen by nine it cropped it

00:22:33to sixteen by nine and it made it 640

00:22:36pixels wide of the image that you

00:22:37uploaded this is kind of cryptic but

00:22:41they’re actually making it much

00:22:42easier for you by having a proper SDK

00:22:45in as you can see almost every language

00:22:46out there Ruby PHP Python Node.js Java

00:22:49whatever and that one allows for

00:22:52intelligent resizing of images so when

00:22:54you now resize the browser it gives you

00:22:57the image that uses the best space and

00:22:59if you’re on the right-hand side you can

00:23:00see that the images show more and

00:23:03less but keep the people in the middle

00:23:05of it because they center on the face so

00:23:07that way you can automatically art

00:23:08direct your images without having to

00:23:10crop them by hand because the machine

00:23:12learning algorithm does that for you and

00:23:14understands that for you imgix is

00:23:17another service that actually does that

00:23:19and they are even getting better they’re

00:23:21not only using facial detection which is

00:23:23like eye nose mouth but they’re also

00:23:25doing a high contrast version of

00:23:28your image and that way find out the

00:23:30most important parts so the same here

00:23:32they’re doing the outlines and the high

00:23:34contrast image and then crop the rest that

00:23:35doesn’t have enough contrast and that

00:23:37also works that way so what about

00:23:40information that isn’t in the image this

00:23:42is this is basically what you can do

00:23:44with the image itself but what if we

00:23:46want to know that this is a coffee mug

00:23:47and this is like or this was like the

00:23:50current President of the United States

00:23:51in their image and we don’t want to have

00:23:54to know that machine learning and

00:23:57artificial intelligence to the rescue

00:23:58robots and computers are there to plow

00:24:00through data and data and oodles and

00:24:03oodles of data without getting bored and

00:24:04this is the good thing about computers

00:24:06this is what we should be using them for

00:24:07so Facebook has for example an

00:24:11automated alt text this is a

00:24:13photo a friend of mine uploaded and it

00:24:15says image may contain so there

00:24:17you see that it’s actually automatically

00:24:18generated dog outdoor and

00:24:21nature and you can see it in the

00:24:22alternative text on the image itself

00:24:24here as well if you open any

00:24:26developer tools here in this case

00:24:29Firefox how do they know that do they

00:24:33have like people in the basement chained

00:24:35to a desk that actually have to type

00:24:36things in maybe I don’t know but I think

00:24:39most of the time they use computers for

00:24:41that it’s not Mechanical Turk anymore

00:24:42that used to be the thing in Amazon to

00:24:44do these kind of things so there’s a

00:24:45great blog post on the Facebook code

00:24:48blog that explains how they’ve been

00:24:50doing that for years and years like all

00:24:52the images that are in Facebook have

00:24:53been analyzed have been classified have

00:24:56been detected and have been segregated

00:24:58or segmented into different sections so

00:25:01you say like okay I’ve got

00:25:03sheep

00:25:04I’ve got dog I’ve got man and then you

00:25:07find all the sheep the dog and a man and

00:25:09you segment them out and that way you

00:25:12have it in the database if something

00:25:13looks a bit like that it’s probably a

00:25:15sheep from behind and they do that with

00:25:16all kind of data that they have on

00:25:18Facebook images already and now finally

00:25:21they gave us access to that one as well

00:25:22in a programmatic level that we can use

00:25:24that for our implementations as well so

00:25:27it’s not that they just are evil and

00:25:29find our data they’re giving it out as

00:25:31well which is pretty good and Google

00:25:34have been doing that on Google Photos

00:25:35for quite a while as well I showed that

00:25:36the other day in Germany like my photos

00:25:39I don’t type any German I don’t I only

00:25:41type English but for example you can

00:25:42click on selfies in Google Photos and

00:25:45it automatically finds the pictures that

00:25:47are selfies without you ever having to

00:25:49type in that this was a selfie so this

00:25:51was at Smashing Conf another

00:25:53conference and it’s basically me what I

00:25:55had talking these kind of things and it

00:25:57also finds locations for you so I say

00:25:59for example Tel Aviv and it doesn’t even

00:26:01it doesn’t only use the JPEG data of Tel

00:26:04Aviv but this is for example Heathrow on

00:26:05my way on the flight to Tel Aviv so I

00:26:07don’t really know how they did that but

00:26:09it is the right photo and the pictures

00:26:11of these emojis and these kind of things

00:26:13are all done in Tel Aviv as well I can

00:26:15then say Hund which is dog in German

00:26:17I’ve never entered that ever but I have

00:26:20uploaded pictures of dogs and Katze

00:26:22for cat it detects my family’s dog

00:26:25as a cat which is true because he

00:26:27behaves like one but I don’t know

00:26:30sometimes it’s not that good at

00:26:32these things but it’s pretty amazing

00:26:33that you have all this cool information

00:26:35in there without having to type it so

00:26:38the data behind that is from databases

00:26:41that have been used for years and years

00:26:43to classify and tag images there’s ImageNet

00:26:45for example which is 14 million

00:26:48images right now

00:26:49and that one gives you a database to

00:26:51compare your images against and

00:26:53to find the right solutions to this is a

00:26:55cat this is a dog and so on and so forth

00:26:57and Google just last week released

00:27:00the Open Images data set and that’s over

00:27:029 million URLs of images that are

00:27:04tagged and classified for you so you

00:27:06can use that CSV it’s on GitHub and with

00:27:09the metadata that you can download and

00:27:11run it against your own learning

00:27:12services to understand what your images

00:27:15might have in them and this one has for

00:27:18example in this picture the balcony

00:27:20stairs facade iron and so on and it’s

00:27:23not like just like that’s a spoon and

00:27:24that’s and that’s a fork it has lots of

00:27:26information in there and is highly

00:27:28highly detailed that you could then for

00:27:30example run through a translation

00:27:31service to find like the Danish dog or

00:27:34the Danish well Danish dog is like those

00:27:36big ones but that’s a different story
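As a rough illustration of working with a tagged-image CSV like the one mentioned above, here is a minimal Python sketch. The column names (image_url, label, confidence) and the sample rows are made up for the example; the real data set's files may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real data set's columns may differ.
SAMPLE_CSV = """image_url,label,confidence
https://example.com/a.jpg,balcony,0.9
https://example.com/a.jpg,stairs,0.8
https://example.com/a.jpg,spoon,0.2
https://example.com/b.jpg,dog,0.95
"""


def labels_by_image(csv_text, min_confidence=0.5):
    """Group labels by image URL, dropping low-confidence labels."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row["confidence"]) >= min_confidence:
            grouped[row["image_url"]].append(row["label"])
    return dict(grouped)


if __name__ == "__main__":
    for url, labels in labels_by_image(SAMPLE_CSV).items():
        print(url, labels)
```

The grouped labels could then be passed to a translation step or used as a starting point for alternative text.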

00:27:40they also run these tags

00:27:44then through a language compiler and we

00:27:46do that as well with a few of our

00:27:47services so image captioning it’s open

00:27:49sourced in TensorFlow and you can use

00:27:51that and what they use it for is mostly

00:27:54for the Google Photos app but

00:27:56also the upcoming Allo chat client

00:27:59that they’re doing so in this case they

00:28:01have like human captions from the

00:28:02training set which is like a man riding

00:28:04a wave on top of a surfboard and the

00:28:06automatically captioned one is finding

00:28:08three different images from that so they

00:28:10take whole sentences from that data set

00:28:12rather than just having a tag saying

00:28:14surfboard man wave which is not human

00:28:17readable and not beautiful and they also

00:28:20then do syntax detection on

00:28:22these things and find the nouns and find

00:28:24the attributes and mix and match them to

00:28:27make better captions for other things

00:28:28afterwards and they also then take

00:28:32it together with the image data and for

00:28:34example instead of saying like as a

00:28:36train with a Union Jack on the side it

00:28:38says like it’s a blue and yellow train

00:28:39because it also detected again how many

00:28:42pixels are blue and how many are yellow

00:28:43and there’s two brown bears as well

00:28:46instead of just two bears and what they

00:28:50use it for is for the Allo interface if you

00:28:52upload an image and you get these

00:28:53automated tags that if you don’t want to

00:28:55type something in you can just tap that

00:28:56on there which is pretty cool but I find

00:28:58it really bizarre isn’t that doesn’t it

00:29:00mean in the end that we as humans just

00:29:02become a transportation

00:29:03service for two bots to talk to each

00:29:05other because I’d rather have you type

00:29:07something in and mistype it and make it

00:29:09human than just give me like Oh friend

00:29:12robot answer kind of thing you know it’s

00:29:14like it’s it’s really odd but people

00:29:16seem to be too lazy to type it in so

00:29:18they want that fine we have something

00:29:21like that as well called CaptionBot

00:29:23which is using three of our services so

00:29:26all of these services are available

00:29:27Google’s TensorFlow Facebook’s whatever

00:29:30it’s called Amazon has a few things in

00:29:32there with their Alexa systems or pure Alexa

00:29:34skills and we have the Microsoft

00:29:35Cognitive Services that you can play

00:29:37with where you get 5,000 hits a month

00:29:39and then you can pay for more later on
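Calling a captioning service of this kind over REST might look roughly like the Python sketch below. The endpoint URL, the header name and the response shape are placeholders invented for illustration, not the documented API of any of the services named above. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint and header name, for illustration only.
ENDPOINT = "https://example.com/vision/describe"


def build_caption_request(image_url, api_key):
    """Build (but do not send) a POST request asking for an image caption."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


def best_caption(response_json):
    """Pick the highest-confidence caption from an assumed response shape."""
    captions = response_json.get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c["confidence"])["text"]


if __name__ == "__main__":
    request = build_caption_request("https://example.com/skater.jpg", "YOUR-KEY")
    print(request.get_method(), request.full_url)
    # Assumed response shape, with the caption quoted in the talk as sample data.
    sample = {"captions": [
        {"text": "a young man jumping in the air on a skateboard", "confidence": 0.9},
        {"text": "a person outdoors", "confidence": 0.4},
    ]}
    print(best_caption(sample))
```

Keeping request building separate from response handling makes it easy to try the parsing against saved sample responses without spending any of the monthly quota.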

00:29:41so with this one you upload an image you can

00:29:43try it out at captionbot.ai it analyzes

00:29:46the image and says I think it’s a

00:29:47young man jumping in the air on a

00:29:49skateboard and you see there we don’t

00:29:51have man skateboard young we basically

00:29:53have a whole sentence because we ran it

00:29:55through a language analysis tool in

00:29:58machine learning as well to give you a

00:29:59sentence at the end now detecting humans

00:30:02is a very important thing as well

00:30:05one of our services does that for you so

00:30:08it realizes this is a 28 year old

00:30:11man in water swimming and it also tells

00:30:14you if you scroll down I don’t have

00:30:15that animated this time and I don’t have

00:30:17it live here right now it also finds the

00:30:19colors for you and it realizes if it’s a

00:30:21racy photo or if it’s an adult content

00:30:23photo so before you upload then you can

00:30:25automatically do that the other service

00:30:28that we have is automatically detecting

00:30:30child pornography in case you have an

00:30:33open system where you allow

00:30:35anyone to upload anything you don’t want

00:30:37that to be abused by the most horrible

00:30:39people on the Internet you can do that

00:30:41you can run it through that service

00:30:42before and it automatically flags and

00:30:44deletes images that have been already

00:30:46recognized as a totally illegal content

00:30:49and that way we protect both the

00:30:54people that had these pictures taken of

00:30:55them and you from prosecution because we

00:30:58actually find out who’s been uploading

00:30:59them for example the lady down here in

00:31:03the bikini will be flagged as racy but

00:31:05not as adult and this one will find out

00:31:07train and train station and all kind of

00:31:10things it’s a city city line so the

00:31:12images are there to find the information

00:31:14in but mostly once we detect the face we

00:31:17also guess the age and we also give you

00:31:19the gender once we have a face we also

00:31:22give it an ID in your data set so you

00:31:24can try that out for yourself for

00:31:26example verification or login systems

00:31:29or detecting if the same person is in

00:31:31two different images and then

00:31:32automatically clustering them into

00:31:35different folders we also have emotion

00:31:38detection so we detect for example that

00:31:40the man here is is kind of happy but

00:31:44he’s also what else is there he’s a bit

00:31:48of fear no the kid the kid has a bit of

00:31:50fear and it’s a bit of a bit of neutral

00:31:52and a bit of surprise so his mouth being

00:31:54open sadly enough I didn’t bring it with

00:31:56me normally at our booth I have this

00:31:58demo where you have to show all the

00:32:00different emotions and then you can win

00:32:01a prize which is pretty pointless to try

00:32:03in Finland but from time to time you

00:32:05it’s fun to see what computers think our

00:32:08different emotions are in our different

00:32:10states of emotions are so you can detect

00:32:12the faces you give the JPEG you get the

00:32:14JSON back and that is basically just

00:32:15a REST API that you can throw an image

00:32:18against and you find out the pupil left

00:32:19the pupil right and the age and what the

00:32:22head pose is like which angle the

00:32:25face is on so when you do for example

00:32:27login you don’t just do it with one face

00:32:29you have to ask the person to change it

00:32:31so you can see it’s a 3d face and not

00:32:33somebody holding up a picture of you to

00:32:34log into your computer you can verify

00:32:37the face once we know it is that the

00:32:39same person no it’s obviously not the

00:32:41same person you can cluster them into

00:32:43different clusters automatically so

00:32:45these are men these are women these are

00:32:46in-betweens these are I don’t know today

00:32:48kind of things and the great thing is

00:32:50that putting these all together you can

00:32:52really empower people and there’s where

00:32:54I want to show you a quick video that a

00:32:56colleague of mine has done and it’s it’s

00:32:59pretty stunning so let’s just watch that

00:33:01quickly together
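The face detection JSON described above (pupil positions, age, head pose) could be handled along these lines in Python. The field names and sample values are assumptions about the response shape made for this sketch. The head-turn check is a simplified stand-in for asking a person to move their head during login, not a real liveness test.

```python
import json
import math

# Illustrative response only: the field names below are assumptions about
# the JSON shape, not a documented contract.
SAMPLE_RESPONSE = """
[
  {
    "faceId": "face-1",
    "faceLandmarks": {
      "pupilLeft": {"x": 100.0, "y": 120.0},
      "pupilRight": {"x": 160.0, "y": 120.0}
    },
    "faceAttributes": {
      "age": 28,
      "headPose": {"yaw": -2.0, "pitch": 0.0, "roll": 1.0}
    }
  }
]
"""


def summarize_face(face):
    """Extract age, distance between pupils (pixels) and head yaw (degrees)."""
    left = face["faceLandmarks"]["pupilLeft"]
    right = face["faceLandmarks"]["pupilRight"]
    return {
        "age": face["faceAttributes"]["age"],
        "pupil_distance": math.hypot(right["x"] - left["x"], right["y"] - left["y"]),
        "yaw": face["faceAttributes"]["headPose"]["yaw"],
    }


def head_turned(first_face, second_face, min_yaw_delta=15.0):
    """Return True if the head yaw differs enough between two detections."""
    yaw_a = first_face["faceAttributes"]["headPose"]["yaw"]
    yaw_b = second_face["faceAttributes"]["headPose"]["yaw"]
    return abs(yaw_a - yaw_b) >= min_yaw_delta


if __name__ == "__main__":
    faces = json.loads(SAMPLE_RESPONSE)
    print(summarize_face(faces[0]))
```

Comparing the yaw of two detections is one simple way to express the idea that a flat photo held up to the camera will not show the same change in angle as a real head turning.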

00:33:14I’m Saqib Shaikh I lost my sight when I

00:33:18was seven and shortly after that I went

00:33:21to a school for the blind

00:33:22and that’s where I was introduced to

00:33:26talking computers and that really

00:33:27opened up a whole new world of

00:33:29opportunities I joined Microsoft ten

00:33:32years ago as a software engineer I love

00:33:35making things which improve people’s

00:33:37lives and one of the things I’ve always

00:33:40dreamt of since I was at university was

00:33:42this idea of something that could tell

00:33:45you at any moment what’s going on around

00:33:46you I think it’s a man jumping in the

00:33:51air doing a trick on a skateboard

00:33:55I teamed up with like-minded engineers

00:33:58to make an app which lets you know who

00:33:59and what is around you it’s based on top

00:34:02of the Microsoft intelligence APIs

00:34:05which makes it so much easier to make this

00:34:07kind of thing the app runs on

00:34:09smartphones but also on the Pivothead

00:34:11smart glasses when you’re talking to a

00:34:14bigger group sometimes you can talk and

00:34:17talk and there’s no response and you

00:34:19think is everyone listening really well

00:34:21or are they half asleep and you never

00:34:24know I see two faces 40 year-old man

00:34:29with a beard looking surprised 20

00:34:31year-old woman looking happy the app can

00:34:34describe the general age and gender of

00:34:37the people around me and what their

00:34:38emotions are which is incredible one of

00:34:42the things that’s most useful about the

00:34:44app is the ability to read out text okay

00:34:50thank you I can use the app on my phone

00:34:52to take a picture of the menu and it’s

00:34:55gonna guide me on how to take that

00:34:57correct photo move camera to the bottom

00:35:00right and away from the document and

00:35:02then it’ll recognize the text read me

00:35:04the headings I see appetizers salads

00:35:07paninis

00:35:08pizzas pastas years ago this was

00:35:12science fiction I never thought it would

00:35:14be something that you could actually do

00:35:16but artificial intelligence is improving

00:35:18at an ever-faster rate and I’m really

00:35:21excited to see where we can take this

00:35:23as engineers we’re always standing on

00:35:26the shoulders of giants building on top

00:35:28of what went before and in this case

00:35:29we’ve taken years of research from

00:35:31Microsoft research to pull this off I

00:35:33think it’s a young girl throwing an

00:35:35orange frisbee in the park for me it’s

00:35:37about taking that far-off dream and

00:35:39building it one step at a time I think

00:35:42this is just the beginning how cool is

00:35:47that I mean it just fascinates me that

00:35:49he’s like an engineer himself writing

00:35:52this and I sat next to a blind

00:35:54PHP engineer for years he was much

00:35:55faster than me coding and I was just

00:35:57confused about this but it’s just it’s

00:35:59so insightful when you when you

00:36:01actually do that kind of that kind of

00:36:03attitude and I what I love about machine

00:36:05learning is that people with

00:36:06disabilities are these super humans to

00:36:08test against if we if it works for them

00:36:11then it works for us even better and

00:36:13that’s a really really cool way of I’ve

00:36:15been doing accessibility for years and

00:36:17years and trying to make people

00:36:18understand that disability is not the

00:36:19end of things but it actually is an

00:36:21opportunity for everybody and with

00:36:24inclusive design ideas and this kind of

00:36:26thinking we have a great opportunity to

00:36:28make things understandable for everybody

00:36:31who doesn’t for example see or isn’t

00:36:34able to understand it or take a picture

00:36:36of something if you’ve seen for example

00:36:37Google Translate on a phone as well the

00:36:40app you can just take a picture you can

00:36:42you can turn your camera on and see a

00:36:44street sign and it translates it live

00:36:47for you from the camera and I mean how

00:36:49friggin cool is that when you’re in

00:36:50Russia and you don’t know what the name

00:36:51of that street is you only have the

00:36:53English name for example and these

00:36:55things are all possible because we have

00:36:57these massive amounts of data and what I

00:36:59love about this example as well is that

00:37:01it’s all open data it’s like he just

00:37:04used the open APIs from Microsoft he

00:37:06didn’t have any internal access that

00:37:07gave him extra access or something

00:37:09because we didn’t build those sadly

00:37:11enough but he just built that for

00:37:14himself with systems that are open and

00:37:16he’s now gonna release

00:37:18it on iPhone I think first and I think

00:37:20this is pretty amazing when you compare

00:37:22it to the other services that are out

00:37:24there I was just at an accessibility

00:37:25conference where there was a commercial

00:37:27company showing the same thing oh we got

00:37:29a we got glasses that can detect people

00:37:31and tell you when they’re in the room

00:37:32and it was 4,500 euro for those glasses

00:37:35and that whole solution

00:37:37can run on any smartphone right now

00:37:40and you don’t need those extra glasses

00:37:42to get the same functionality and that’s

00:37:44what I want you to think about when it

00:37:46comes to this machine learning and images

00:37:48stuff the APIs are out there there’s

00:37:50just trillions of photos that we already

00:37:52indexed for you so cross-reference your

00:37:55own data and make your images more

00:37:57accessible that way and that’s all I had

00:37:59for now so thank you very much