As part of my day job I've analyzed hundreds of different code bases, most of them open source but also a lot of proprietary and commercial code bases. Today I want to share some of my observations with you, because there are some patterns that I see recur over and over again, and those patterns are independent of programming language or technology. And I want to do this in a different way, because I'm pretty convinced that we as an industry know what good code looks like. We know about the importance of naming, testability, cohesion, and all that other stuff. So today I want to go beyond that.
Instead, I'm going to start with data analysis. Because isn't it fascinating: data analysis has become mainstream, and the rise of machine learning has taught us how to find patterns in complex phenomena. Yet we as developers have never taken those techniques and turned them on ourselves to understand our own behavior and how our code bases grow.
So that's what I want to do today. I want to show you how behavioral data about us as developers can help us make better decisions, decisions that guide us towards maintainable code bases. And the good news is that you all already have all the data that you need. We're just not used to thinking about it that way. I'm talking about the version control systems. Our version control data is a perfect behavioral log of how we as developers have interacted with our code. So let's uncover the secrets of version control data.
My first observation is that maintainable code goes way beyond technology. We as developers tend to be pretty good on the technical part of that, but my experience is that we often miss the social implications of our designs. Let's see why that is important.
designs let’s see what that is important this is what pretty much every system
this is what pretty much every system
this is what pretty much every system that
that
that worked with over the past 10 years has
worked with over the past 10 years has
worked with over the past 10 years has looked like we have a number of big
looked like we have a number of big
looked like we have a number of big subsystems usually free that we somehow
subsystems usually free that we somehow
subsystems usually free that we somehow integrate and let them exchange
integrate and let them exchange
integrate and let them exchange information and at one end we have a big
information and at one end we have a big
information and at one end we have a big database now each one of those
database now each one of those
database now each one of those subsystems can be fairly large they can
subsystems can be fairly large they can
subsystems can be fairly large they can consist of hundreds of thousands or even
consist of hundreds of thousands or even
consist of hundreds of thousands or even millions of code and the scale of that
millions of code and the scale of that
millions of code and the scale of that alone makes it really really hard to
alone makes it really really hard to
alone makes it really really hard to reason about it but this is not just
reason about it but this is not just
But this is not just about scale, because today's systems tend to be written in multiple different programming languages, and of course they are developed by multiple programmers organized into multiple teams. That leaves everyone with their own view of how the system looks. No one has a holistic picture. But this isn't really just about technology, because systems that look like this tend to come with an organization that looks like that. So our main challenge is to balance these technical and social, organizational complexities. This is actually a hard problem, but it's a problem that we can simplify somewhat by understanding the next point here.
My next observation is that all code is equal, but some code is more equal than others. To explain what I mean by that, I want to share a little story with you. This is something that happened to me a number of years ago. At that time I was working on a large legacy code base, and what we did was we bought a tool that was able to measure a bunch of different complexity measures and produce something called a technical debt metric. So we basically took that tool, threw it at the code base, and out came a prioritized list of every module in the system, and each module had a technical debt number assigned to it.
The interesting thing here was that there was a clear number one candidate. There was one piece of code that was way worse than the rest of it. However, when we started to investigate that piece of code, it turned out that that code had been stable for three years. We never needed to touch it. It worked wonderfully in production. It just did its job. So should we really spend time and effort improving that piece of code? What can we expect to gain? To me it's just a big risk. And besides, we probably all have much more urgent matters to attend to.
And to find out those matters we need to do something differently: we need to take an evolutionary look at our code. Here's what it looks like. Please have a look at these different graphs. All of them show the same thing. On the x-axis you have each file in the system, and those files are sorted according to their change frequencies, that is, how many commits we have done that influence that particular file. And the number of commits is what you see on the y-axis.
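
A minimal sketch of how the change frequencies behind such a graph could be computed. It assumes the input is the text printed by `git log --name-only --pretty=format:`, which lists the files touched by each commit, one path per line. The function name `change_frequencies` and the sample log are made up for illustration and do not come from the talk.

```python
from collections import Counter


def change_frequencies(log_text: str) -> list[tuple[str, int]]:
    """Return (path, commit count) pairs, most frequently changed first.

    Assumes `log_text` is the output of
    `git log --name-only --pretty=format:`, where every non-empty line
    is a file path touched by one commit.
    """
    counts = Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )
    return counts.most_common()


# Hypothetical log: three commits touching a handful of files.
sample_log = """src/core.py
src/util.py

src/core.py

src/core.py
README.md
"""

for path, commits in change_frequencies(sample_log):
    print(f"{commits:3d}  {path}")
```

Sorting the files by this count gives the x-axis order described above, and the count itself is the y-axis value.
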
Now, the interesting thing here is that what you see are three radically different code bases, written in completely different programming languages and with completely different lifetimes. Yet they all show the same pattern: you see a power-law distribution. And this is something I've seen in every single code base that I've analyzed.
But what this means to you is that in your typical system, most of your work tends to be in a relatively small part of the code base, and most of your code is in the long tail, which means it's rarely, if ever, touched. And this is important because it gives us a tool to prioritize, to focus on the code that really matters. So now we're able to narrow down the amount of code we need to inspect if we want to make a real improvement that matters in terms of productivity.
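
One way to check how concentrated the work is in a given code base is to ask what share of all commits lands in the most frequently changed files. The sketch below is only an illustration: the function name `share_in_top_files` and the commit counts are invented, not data from any of the systems in the talk.

```python
def share_in_top_files(commit_counts: list[int], top_fraction: float = 0.1) -> float:
    """Fraction of all file changes that fall in the top `top_fraction` of files."""
    total = sum(commit_counts)
    if total == 0:
        raise ValueError("no commits recorded")
    ranked = sorted(commit_counts, reverse=True)
    # Always look at least at the single most changed file.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / total


# Made-up per-file commit counts with a long tail of rarely touched files.
counts = [120, 45, 20, 8, 3, 2, 1, 1, 0, 0]
print(f"{share_in_top_files(counts, 0.1):.0%} of the changes hit the top 10% of files")
```

A high share for a small `top_fraction` is the long-tail shape described above: a few files attract most of the work.
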
And this is a good starting point, but we need to do even better. To explain how, I need to take a little risk. I'm going to put up a slide, and I think this is the point in the presentation where you will all violently disagree with me. But here we go.
When it comes to maintainable code bases, complexity isn't the problem. We developers are so conditioned to despise bad code. We have learned that we need to refactor the bad code as we see it, and we have learned that we need to keep all our code clean and nice, and that that is the fastest way forward. The problem I have with that is that it's just not true, and the graph on the previous page indicates why that may be the case. But still, we cannot discard complexity entirely. We just need to find out when complexity matters and when it doesn't.
And this is a problem I've been struggling with for years. Then, five years ago, I was working again on a fairly large system with a lot of people, and at the same time I was enrolled at university, where I took a course in forensic psychology. I think that forensics in general has a really beautiful mindset that applies well to software development as well. But there was one technique in particular that I really want to show you. This is a technique called geographical offender profiling.
Geographical offender profiling is a technique that we use to catch serial offenders, and you see an example here of how it looks. This is a map of a city, a city that looks pretty much like London. You see those green dots: each one of those dots represents a crime scene. We have 56 different crime scenes, and we know that those 56 crimes were committed by the same offender. How do we know it's the same offender? Well, perhaps we have hard evidence like DNA or fingerprints, or we have just found that it's the same modus operandi, the same method of operation used by the criminal at the different crime scenes.
Now, what you do in geographical offender profiling is that you use the distribution of those crimes to calculate a probability surface. That probability surface is based on a mathematical weighting of the distribution of the crimes, but we also weight that formula with our knowledge of human behavior. Using this probability surface, we are now able to predict the most probable home location of our offender, and that's there in the red area. That's what you call a hotspot, and according to the research there's a 70% chance that our offender will have his home base there.
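
To make the idea of a probability surface concrete, here is a toy sketch. It is not the formula used in actual offender profiling: it assumes a plain Gaussian distance weighting, leaves out the behavioral weighting mentioned above, and the function name `hottest_cell`, the grid, and the `sigma` parameter are all invented for illustration.

```python
import math


def hottest_cell(crimes: list[tuple[int, int]], width: int, height: int,
                 sigma: float = 2.0) -> tuple[int, int]:
    """Return the grid cell with the highest summed closeness to all crime scenes.

    Each crime scene contributes exp(-d^2 / (2 * sigma^2)) to a cell at
    distance d (an assumed weighting, chosen only for this sketch).
    """
    best_cell, best_score = (0, 0), -1.0
    for x in range(width):
        for y in range(height):
            score = sum(
                math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
                for cx, cy in crimes
            )
            if score > best_score:
                best_cell, best_score = (x, y), score
    return best_cell


# Four hypothetical crime scenes on a 7 x 7 grid.
scenes = [(2, 2), (2, 4), (4, 2), (4, 4)]
print(hottest_cell(scenes, width=7, height=7))
```

The point of the sketch is only the narrowing-down step: from a whole grid to the one region where it pays off to look first.
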
And the reason I think this applies well to software is this: think about what we have done. We have taken a potentially vast geographical area and narrowed it down to a much, much smaller part, a much smaller part where we can now focus our human expertise and still be pretty sure that we catch that offender. So what if we could do this in software? What if we could take those horrible million lines of code and narrow them down to a few hotspots, and know that if we focus on improvement there, we get a real effect? Let's see how that may look in software.
Here we go. So this is a geographical offender profile of a fairly large system. What you see there is approximately three hundred thousand lines of code, and the data here is built up from a version control system, our behavioral log, and it's based on two thousand four hundred different commits. By identifying patterns in those commits, we were able to project a probability surface onto our code, and using that probability surface we're able to predict the most probable maintenance savings in that code.
Now, I'm going to walk you through this visualization in a minute, but before I do that I just want to point out that a hotspot analysis like this is actually a social analysis, because this data is based on the collective intelligence of all contributing authors.
All right, so what do you see? You see that there are some large blue circles. Each one of those large blue circles represents a subsystem. This is a hierarchical visualization, so it pretty much follows the folder structure of your project. And when you do detailed and large-scale visualizations, it's also vital that you keep them interactive, so that we can zoom in and out to the level of detail of interest.
And if you zoom in on one of those subsystems, you will see that we represent each file as a circle. You will also see that those circles have different sizes, and that's because size is used to represent complexity. Complexity is something you measure from the code, and we have a bunch of different complexity measures to choose from. You can basically pick any metric you want, because what they all have in common is that they are equally bad. So I tend to go with the simplest possible thing: I tend to pick the number of lines of code, which also has the advantage of being language neutral.
But still, I said a minute ago that complexity alone isn't a problem, so we need something else. We need to understand if we actually work in that code or not: we need to understand the change frequency of the code, and this is something you pull out of version control data. The interesting thing here, of course, is when we combine these two dimensions, because now we're able to identify complicated code that we also have to work with often.
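
A rough sketch of combining the two dimensions into a ranking. The talk does not say how the two numbers are combined, so the scoring here (each dimension scaled by its maximum, then multiplied) is an assumption, and the names `FileStats` and `rank_hotspots` and the sample files are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    lines_of_code: int  # complexity proxy, measured from the code
    commits: int        # change frequency, taken from version control


def rank_hotspots(files: list[FileStats]) -> list[tuple[str, float]]:
    """Rank files so that large files that also change often come first."""
    if not files:
        return []
    # `or 1` avoids dividing by zero when every value is 0.
    max_loc = max(f.lines_of_code for f in files) or 1
    max_commits = max(f.commits for f in files) or 1
    scored = [
        (f.path, (f.lines_of_code / max_loc) * (f.commits / max_commits))
        for f in files
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical files: a big but stable module, a small busy one, a big busy one.
stats = [
    FileStats("legacy/engine.c", lines_of_code=5000, commits=2),
    FileStats("api/routes.py", lines_of_code=200, commits=150),
    FileStats("core/scheduler.py", lines_of_code=3000, commits=120),
]
for path, score in rank_hotspots(stats):
    print(f"{score:.2f}  {path}")
```

With this scoring the large, stable file ends up at the bottom of the list, which matches the earlier story about the module that had not been touched for three years.
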
And I will show you a real-world case study of how this may look. This is a study of Microsoft's open source project Roslyn. Roslyn is a really interesting project to study, because Roslyn is an open source compiler platform, and it actually includes two compilers itself: it includes the C# compiler, written in C#, and the Visual Basic compiler, written in Visual Basic. So Roslyn is a polyglot code base.
Now, if we look at the top hotspots in Roslyn right now, you will see that the number one hotspot is something called CommandLineTests, and it’s written in C#. Another top hotspot in Roslyn is called CommandLineTests, and it’s written in Visual Basic. Hmm, I wonder if there’s some kind of relationship between those two. In a few minutes we’re going to see a technique that helps us answer that question. But for now I just want to point out that those hotspots look tiny on screen, but that’s just because of the size of Roslyn. Roslyn is huge: what you see there is almost four million lines of code, and each one of those CommandLineTests is a file with 7,000 lines of code.
And I would also like to argue that if you have 7,000 lines of code in a file called CommandLineTests, perhaps that isn’t a good unit of test. What I would do in this case is look for the different responsibilities and start to break that file down into separate test suites, for example one for parsing, one for validation, one for reading debug flags. If you do that, each one of those new test files will of course become easier to reason about in isolation. But that’s not the most important thing. The most important thing is that you end up with an entirely new context, because now, if you continue to do a hotspot analysis like that, you will be able to see which parts of the solution you managed to stabilize and which parts continue to change.
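The hotspot idea described above, change frequency from version control combined with a complexity measure, can be sketched roughly as follows. This is a minimal illustration and not the tooling used in the talk: it assumes the per-commit file lists have already been extracted from version control, it uses lines of code as a crude stand-in for complexity, and the function names, file paths and the normalized product used as a score are all hypothetical choices.

```python
from collections import Counter


def change_frequency(commits):
    """Count in how many commits each file was touched."""
    freq = Counter()
    for files in commits:
        freq.update(set(files))  # count a file at most once per commit
    return freq


def rank_hotspots(commits, lines_of_code):
    """Rank files by (normalized change frequency) x (normalized size).

    The product is an assumed scoring rule for illustration only.
    """
    freq = change_frequency(commits)
    if not freq or not lines_of_code:
        return []
    max_freq = max(freq.values())
    max_loc = max(lines_of_code.values())
    if max_loc == 0:
        return []
    rows = []
    for path, loc in lines_of_code.items():
        revisions = freq.get(path, 0)
        score = (revisions / max_freq) * (loc / max_loc)
        rows.append((path, revisions, loc, round(score, 3)))
    # Highest score first: large files that also change often.
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical history: each inner list is the set of files in one commit.
commits = [
    ["src/parser.py", "test/command_line_tests.py"],
    ["test/command_line_tests.py"],
    ["src/flags.py", "test/command_line_tests.py"],
    ["docs/readme.md"],
]
# Hypothetical file sizes in lines of code.
lines_of_code = {
    "src/parser.py": 900,
    "src/flags.py": 300,
    "test/command_line_tests.py": 7000,
    "docs/readme.md": 50,
    "src/legacy_blob.py": 9000,  # large, but never touched in this history
}

for row in rank_hotspots(commits, lines_of_code):
    print(row)
```

Note how the largest file in the sample data ends up last: it is big, but nobody works in it, which is the point that complexity alone isn’t a problem. Re-running the same ranking after splitting a large file shows which of the new, smaller files keep changing and which have stabilized.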
So cohesion is a tool that gives you additional insights into the evolution of your code. Another observation that maybe you saw is that those two modules were test code. This is again something I’ve found over and over: the worst offenders in code tend to be in the test code. I think the main reason for that is that we developers tend to make a mental divide. On one hand we have the application code, and we know it’s vital that we keep it clean, that it’s possible to maintain and evolve it. On the other hand we have the test code, and most of the time we’re happy if we get around to writing any of it at all. I think this is a dangerous fallacy, because from a maintenance perspective there’s really no difference between test code and application code. If you have tests lacking quality, they will hold you back.
All right, so let’s see what the hotspots actually brought us. When we added the complexity dimension, we were able to narrow down the amount of code we need to investigate even more. I typically find that we’re able to narrow it down to just three to six percent, depending on the code base. This is important because those three to six percent tell you which part of the code we should focus improvements on in order to get improvements in both productivity and quality. The reason I say quality is that hotspots tend to be strong predictors of defects.
All right, so let’s leave the hotspots for a while and talk about something entirely different. Let’s talk about our primary tool as software developers. I’m not talking about the compiler, I’m not talking about the IDE, not Emacs, not even Vim. I’m talking about the brain. The reason I want to talk about the brain is that your brain does not always work in your best interest.
To show you an example, I want to do a little poll here. Please think back to the last large project that you worked on, perhaps the project you work on right now. How many of you know where your hotspots are in that codebase? A few of you, ten people maybe. Please keep your hand up if you’re 100% certain. A few of you. Cool, great, you may well be right. What worries me, though, is that if 100 years of psychological research has taught us anything, it is that we humans can’t really trust our own judgment. To explain what I mean by that, we need to talk about something different. Yes, that’s right: we need to talk about gorillas.
So this is one of my favorite psychological experiments. You may well have heard about it; it’s quite famous. It was done back in the 90s. What the researchers did was record a video of two teams playing basketball against each other, and your task as a participant in that experiment was to count the number of passes. Now, as you sat down and watched the two teams play basketball, something a bit strange happened: suddenly a man dressed in a gorilla suit walks across the basketball court, stops right in front of the camera, turns towards you, and starts to beat on his chest. Then he walks off.
After you’ve seen that video, the researchers ask you: did you notice anything particular? And you would say, of course, a man dressed in a gorilla suit, that sure is a bit odd. But that’s not what happened. It turned out that more than 50% of the participants failed to see the gorilla. A follow-up experiment revealed that people fail to see the gorilla even when it’s right in front of their eyes. Even when the image of the gorilla hits our retinas, we fail to see it. The reason for that is that you don’t see with your eyes, you see with your brain. In order to perceive something, we need to focus our attention on it, but your attention was directed at counting the number of passes. The question for us here is: if we humans are capable of missing something as obvious as a gorilla, what’s the risk that we will miss the gorillas in our own code? What’s the risk that we will overlook our hotspots? And I think it goes deeper than that.
that because now I talked about the failure
because now I talked about the failure
because now I talked about the failure of the human mind a little bit but it
of the human mind a little bit but it
of the human mind a little bit but it turns out we humans are actually really
turns out we humans are actually really
turns out we humans are actually really good at some things and one thing we are
good at some things and one thing we are
good at some things and one thing we are exceptionally good at that is to
exceptionally good at that is to
exceptionally good at that is to rationalize decisions and believes that
rationalize decisions and believes that
rationalize decisions and believes that we don’t even share so let me explain
we don’t even share so let me explain
we don’t even share so let me explain how that works this is something
how that works this is something
This is something completely different. If you took part in this experiment, you were shown two pictures of two different faces, and your task as a participant was to select the face you found the most attractive. Once you have made your selection, the researchers hand you a copy of that photo. Only they don’t; they trick you, so you actually receive a copy of the photo that you didn’t pick. And now they do something really, really evil, because they ask you: please motivate your choice. Interesting. Let’s think about that for a while. So we sit there with a copy of a photo that we didn’t choose, and are now asked to motivate a choice that we didn’t make. Again, you would think that if something like that happened to you, you would of course notice immediately. And again, that’s not what happened. It turned out that more than two-thirds of the participants failed to notice the swap.
And if you read the original research, it’s really great, because they have a transcription of the interviews and the motives that people gave. For example, you had this woman who says: yeah, I picked this photo because I really loved the earrings. The photo that she actually picked doesn’t show any earrings at all. And of course there’s this really confident man who says: yeah, I picked this photo because I really prefer blondes. In reality, the photo he picked showed a dark-haired woman.
Now, what I’ve just told you about are two examples of cognitive biases on an individual level. But if you want to see a real disaster, here’s what you do: you take a number of individuals, put them together, and call that a team. To explain what I mean, we need to travel into what perhaps may look like some unethical corners, but I promise you I will keep this nice, so please just relax. Let me ask you a question instead. How many of you have been given the advice that if you want to make a real impact in a meeting, you should speak first? Just a few of you. Well, in this context it’s actually good advice, because it turns out that the first person who speaks in a meeting will bias the whole discussion, will bias the whole group.
But there’s an even sneakier way to get what you want, and this is something called vocal minorities. Vocal minorities are based upon the fact that when we hear an opinion repeated over and over again, we come to believe that that opinion is more popular and widespread than it actually is. And that’s true even when it’s the same person repeating the same opinion over and over again. So all I have to do in order to manipulate you is to keep repeating things like: do you know that Common Lisp is a great programming language? Have you seen Common Lisp? It’s an amazing language. Now, I try to be a good person, so I will only manipulate you in a good way. Common Lisp is indeed great. But what if it was the other way around? For example, let’s say someone complains a lot: have you seen that network module code? It’s crap, that network module code. We just have to throw it away, it’s so lousy. How do you think that opinion will affect your idea of where the true hotspots are?
And the reason I tell you this is that I really, really, really want to make the case for my next slide, because this is probably the most important thing I’m going to tell you today: do use your intuition, do use your expertise, but make sure to support your decisions with data. All right, let’s move on from gorillas and groups and all this stuff and talk about change patterns in our applications. I want to talk about surprise and the cost of surprise. The reason I want to talk about surprise is that surprise is one of the most expensive things you can put into a software architecture, and there are different kinds of surprises.
I’d like to show you the first kind of surprise by showing you what has to be my all-time favorite code. This is really a work of art. This is the code from the Apollo project, so this is the code that actually took us to the moon. So please have a look at this beauty; in particular, focus on the comments. How many of you want to go to the moon on that code? So one could argue that this is a surprise to the end user. That’s not the kind of surprise I want to talk about today. I want to talk about the surprises we developers leave behind for the poor maintenance programmer coming after us, and I want to show you how a concept called temporal coupling helps us uncover those surprises.
uncover those surprises now temporal coupling is interesting
Now, temporal coupling is interesting because it’s so different from the way we developers typically talk about coupling. When we developers talk about coupling, what we typically mean is a dependency between different parts and pieces. Temporal coupling is different because it’s not measured from code. Temporal coupling is measured from the evolution of your code, so this is something we pull out of version-control data. And temporal coupling is about files, two or more, that keep changing together over time, perhaps even in the same commit.
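A minimal sketch of how such a measurement could be computed, assuming the commit history has already been exported as one set of file paths per commit (for example, parsed from the output of `git log --name-only`). The function name `temporal_coupling`, the sample file names, and the coupling formula (shared commits divided by the average revision count of the two files) are illustrative assumptions, not necessarily what the tools used in this talk do.

```python
from collections import Counter
from itertools import combinations


def temporal_coupling(commits, min_shared=2):
    """Return {(file_a, file_b): degree} for files that change together.

    `commits` is an iterable of sets of file paths, one set per commit.
    The degree is shared commits divided by the average revision count
    of the two files (an assumed formula, chosen for illustration).
    """
    revisions = Counter()  # how often each file changed
    shared = Counter()     # how often each pair changed in the same commit
    for files in commits:
        revisions.update(files)
        shared.update(combinations(sorted(files), 2))

    coupling = {}
    for (a, b), together in shared.items():
        if together >= min_shared:  # ignore pairs that coincide only once
            average = (revisions[a] + revisions[b]) / 2
            coupling[(a, b)] = together / average
    return coupling


# Hypothetical history: each set lists the files touched by one commit.
history = [
    {"billing.py", "invoice.py"},
    {"billing.py", "invoice.py"},
    {"billing.py", "invoice.py", "README.md"},
    {"README.md"},
    {"billing.py"},
]
result = temporal_coupling(history)
for pair, degree in sorted(result.items()):
    print(pair, round(degree, 2))
```

The `min_shared` threshold is there because a pair that happened to share a single commit says very little; only repeated co-change is worth a closer look.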
And I want to show you how that looks by looking at another real-world system. This is a case study from ASP.NET MVC. In ASP.NET MVC they tend to focus a lot on automated tests, so if you look at their code base, they actually have more test code than application code. As a consequence, if you do a temporal coupling analysis, you will find that most of your temporal couples are between test code and the unit under test. And this is not a surprise at all; this is actually what we expect. In fact, I would be worried if that temporal dependency wasn’t there, because it probably means our tests aren’t being kept up to date.

So what I tend to look for instead are couples that have no easy explanation, perhaps something like this. In that code base we have two different files: one is called ScriptTagHelper and one is called LinkTagHelper. If you look at the code, you will see that there’s no immediate dependency between them, yet they keep changing together in 89 percent of all commits. How can that happen?
When I find something like that, I always look at the code, but today I’m going to delegate that responsibility to you. So what will happen now is that I’m going to put up a copy of the ScriptTagHelper next to the LinkTagHelper. Are you ready? Your task is to see if you can spot some kind of subtle pattern here. Ready? Here we go: here’s the ScriptTagHelper; let’s put the LinkTagHelper next to it. Anyone notice a pattern?

Yeah, this is what I find in quite many cases: a dear old friend of mine, copy-paste. But I think I’m being a little bit unfair here, because this is not really copy-paste. They have done something rare: they have actually updated the copy-pasted property names. And even more rare, this is something you almost never find, they’ve updated the copy-pasted documentation. So this, my friends, is not really copy-paste; this is more like copy-paste with gold plating. But still, temporal coupling is a great tool to uncover surprises in our own code.
But we can do even more with it: we can use it to analyze complete software architectures, and I’d like to show you an example from microservices. Why microservices? Because right now microservices are all the rage, and that just means that tomorrow’s legacy systems are going to be microservices. So this is what microservices look like in PowerPoint. In reality they tend to look much more like this.

We are developers; we have learned that we need to abstract away shared functionality. So perhaps we introduce a shared communication library. We notice shared database access patterns, so let’s abstract away those as well. And of course we want to follow the recommendation of providing a shared service template so that all of our services behave in the same way. And obviously we want to make our services as easy to consume as possible, so let’s introduce a bunch of client libraries.

Now, each one of those choices may well be good. What you have to watch out for, though, is that in microservices loose coupling is king. As soon as we start to couple different services to each other, we lose most of the advantages of a microservice architecture and are left with an excess mess of complexity. So what I suggest is that we use temporal coupling not on individual files but on a service level, and we look out for surprising patterns like this, where multiple services change together due to a shared library, or even worse, when multiple services evolve together like this. If you do a temporal coupling analysis like that regularly, you will be able to detect such warning signs in your architecture early, so that you can react in time.
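Lifting the analysis from files to services could look roughly like the sketch below. It assumes that every service (and every shared library) lives in its own top-level directory of a single repository; the `service_of` mapping, the directory names, and the sample commits are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def service_of(path):
    """Map a file path to its service: here, the top-level directory (an assumption)."""
    return path.split("/", 1)[0]


def service_coupling(commits):
    """Count how often each pair of services changes in the same commit."""
    together = Counter()
    for files in commits:
        # Collapse the files of one commit into the set of services they belong to.
        services = sorted({service_of(path) for path in files})
        together.update(combinations(services, 2))
    return together


# Hypothetical commits across three services and one shared library.
commits = [
    {"orders/api.py", "shared-lib/http.py", "billing/client.py"},
    {"orders/model.py", "billing/model.py"},
    {"catalog/search.py"},
    {"orders/api.py", "billing/api.py"},
]
pairs = service_coupling(commits)
for (a, b), count in pairs.most_common():
    print(f"{a} <-> {b}: {count} shared commits")
```

If the services live in separate repositories, the same idea would need a different notion of "changed together" (for instance, commits grouped by ticket or by time window), which this sketch does not cover.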
So remember that I started out this presentation by talking about organizations, and I would like to take it a step further and actually claim that most organizational problems are mistaken for technical issues. The main reason for that is that social information is invisible in the code; it’s just not there. In order to tackle those problems we need to combine our code with social information. Here’s one approach. This is a case study of another open-source project, the development of the Clojure programming language, and this is a visualization called fractal figures. Fractal figures work like this: you consider each file as a box, and each programmer gets assigned a color, and the more that programmer has contributed to the code, the larger their area of the box.
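The data behind such a figure can be sketched as a share per author per file. The example below assumes that contribution is measured in added lines, which the talk does not specify; the function name `ownership_shares` and the file and author names are made up for illustration.

```python
from collections import defaultdict


def ownership_shares(contributions):
    """Return {file: {author: share}} where the shares of one file sum to 1.0.

    `contributions` is an iterable of (file, author, lines_added) tuples.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for path, author, lines in contributions:
        totals[path][author] += lines

    shares = {}
    for path, by_author in totals.items():
        file_total = sum(by_author.values())
        if file_total == 0:
            continue  # nothing to draw for a file without added lines
        shares[path] = {a: n / file_total for a, n in by_author.items()}
    return shares


# Hypothetical data; the file and author names are made up.
log = [
    ("core/eval.clj", "alice", 300),
    ("core/eval.clj", "bob", 100),
    ("core/reader.clj", "carol", 50),
    ("core/eval.clj", "alice", 100),
]
shares = ownership_shares(log)
print(shares["core/eval.clj"])
```

Each share would then become the relative area of that author's color inside the file's box.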
Now, you can use fractal figures for a lot of different things. For example, if we add a color legend, we get a useful communication tool. Let’s say that we join this project: we want to contribute to Clojure, and we want to contribute to the evaluation module in your top left corner. We see that that code is written by Stuart Halloway, so if we have a question about that code, Stuart probably knows a lot about it. We also see that Clojure in general is written by the dark blue developer. That’s Rich Hickey, so if you have a question about Clojure in general, well, Rich Hickey probably knows a thing or two about it.

But fractal figures are useful even without the color legend, because now we want to look out for surprising patterns like this, where we have 20 or 30 different developers contributing to the same piece of code. The reason we want to look out for that is that research has taught us that the number of developers behind a piece of code is one of the best predictors of the number of quality issues you will find, and fractal figures help you identify those modules at risk for defects.
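As a rough illustration of that idea, the number of distinct developers per file can be counted directly from the commit log. The sketch assumes the log has been reduced to (file, author) pairs, one per commit that touched the file; the names below are hypothetical, and the ranking is only a starting point for investigation, not a defect prediction in itself.

```python
from collections import defaultdict


def author_counts(changes):
    """Return (file, number of distinct authors) pairs, most authors first."""
    authors = defaultdict(set)
    for path, author in changes:
        authors[path].add(author)
    return sorted(
        ((path, len(names)) for path, names in authors.items()),
        key=lambda item: (-item[1], item[0]),  # ties are broken by file name
    )


# Hypothetical (file, author) pairs, one per commit that touched the file.
changes = [
    ("compiler.py", "ann"), ("compiler.py", "ben"), ("compiler.py", "cy"),
    ("compiler.py", "ann"), ("utils.py", "ann"), ("docs.md", "ben"),
    ("utils.py", "dee"),
]
ranking = author_counts(changes)
print(ranking)
```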
Fractal figures also explain a lot about our hot spots. Sometimes I come across pretty old code bases, code that’s been around for 10 to 15 years, and what I tend to find in those code bases is that most of the code is stable, and then we have a number of really red, glowing hot spots in the central parts of that code. When I find something like that, I always look back in time, and I see that those hot spots have been around for years. So the question is, of course: why hasn’t anyone done anything about them? Why haven’t they improved the code? Do you know the reason why much existing code is never improved? The reason is that the fractal figures look like this. You have 30 people who work in that code all the time, which means you will impact the work of all those people if you try to redesign that piece. And this leaves us in a very unfortunate situation that I call immutable design. Please trust me on this one, I’m a functional programmer, but in this context there’s nothing good about immutability. And I find it ironic that we cannot improve the code because we are so many people working on it in parallel, and we have to be so many people because we cannot improve the code.
All right, let’s move on. We’ve just talked about knowledge distribution; let’s turn an eye towards our blind spots. I want to tell you a little story about Paul Phillips, whom you see here on screen. Paul Phillips used to work in the Scala code base. He worked on Scala for five years, and during those five years Paul Phillips was the number one contributor to Scala. Then, two years ago, Paul Phillips made this excellent presentation, which I really recommend and have linked there, where he announces his decision to step back and stop contributing to Scala. So for us this is an excellent opportunity to see what happens when a main developer leaves. So I did a study on knowledge loss, and this is two years after Paul Phillips left. Here’s what the knowledge loss looks like in Scala. You see those red areas? In this case they don’t represent any hot spots. No, they represent abandoned code, that is, code that’s written by a developer who is no longer part of the organization.
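One way to approximate such a knowledge-loss map is to compute, per file, the fraction of contributed lines that came from people who have since left. This is a sketch under stated assumptions: contribution is measured in added lines, the list of former developers is supplied by hand, and the 0.8 threshold for calling a file "abandoned" is arbitrary. All names in the sample data are invented.

```python
from collections import defaultdict


def knowledge_loss(contributions, former_developers):
    """Return {file: fraction of contributed lines written by people who left}."""
    total = defaultdict(int)
    lost = defaultdict(int)
    for path, author, lines in contributions:
        total[path] += lines
        if author in former_developers:
            lost[path] += lines
    return {path: lost[path] / n for path, n in total.items() if n > 0}


# Hypothetical contribution data and a made-up list of departed developers.
contributions = [
    ("repl/loop.scala", "pat", 900),
    ("repl/loop.scala", "kim", 100),
    ("compiler/typer.scala", "pat", 500),
    ("compiler/typer.scala", "lee", 500),
    ("library/list.scala", "kim", 200),
]
loss = knowledge_loss(contributions, former_developers={"pat"})
abandoned = sorted(path for path, share in loss.items() if share >= 0.8)
print(loss)
print(abandoned)
```

Aggregating the same fractions per directory instead of per file would give the subsystem-level view.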
And this is something that you of course can use to reason about knowledge distribution. You see that large subsystem to your left? That’s something called a compiler, which may be important. But you can also use it a bit more proactively and look for things like this, where you have an entire abandoned subsystem, in this case the read-eval-print loop. So use it in case you know that you’re planning some features there: you need to schedule some additional time for learning, because it is a hugely increased risk to modify code that we no longer understand.
All right, I’m at my final observation now. I have found that to succeed with maintainable code bases, we need to make it fun. I work at a company called Empear, and we do most of our development in Clojure, and people often ask me: why did you choose Clojure? I could of course tell them stuff like, yeah, you know, we do data analysis, and Clojure is an excellent tool for data analysis. The thing is, I didn’t know that back when we started; I was just lucky. The reason I chose Clojure has nothing to do with technology. I picked Clojure because it looked fun. I wanted to learn the language. And I think that fun is a much underestimated driver of design. In fact, fun is a much underestimated motivator in the software industry, because fun is virtually a guarantee that things get done. So if you work in a large code base, always remember to put the fun into it. Even if you’re locked down in your choice of technology and platform, there’s always a lot of supporting code to write around it, a lot of tasks to automate. So pick a mundane task, use a technology of your choice to automate it, and turn it into a learning experience. Your code is going to thank you for it.

So I’m done now, and before I take some questions I just want to take this opportunity to say thanks a lot for listening to me, and please remember that Common Lisp is a great language. Thanks.
So, thank you. The first question, which I had like five times: how can I get these metrics myself, on my own code base?

Yeah, so that’s a question that I often get: what kind of tools do I use? I actually use my own tools, and the reason I do that is that when I started out with this, there were no tools available that could do the kind of analyses I wanted to do, so I wrote my own. Last year I decided to focus full-time on that, so I now have my startup, Empear, where we develop those tools. We actually have some tools available now, and what will come up soon is a service, so that you can sign up and get an analysis of all your code from that service. We are probably launching a preview quite soon, so sign up if you want to try that.

Does that hook up against GitHub or similar?

Yes, it does.
similar yes it does okay good so um
similar yes it does okay good so um another question here is is number of
another question here is is number of
another question here is is number of times that a file was modified really a
times that a file was modified really a
times that a file was modified really a good measure because sometimes you might
good measure because sometimes you might
good measure because sometimes you might within a day modify a file you know 60
within a day modify a file you know 60
within a day modify a file you know 60 times yeah so there were actually two
times yeah so there were actually two

Yeah, so there were actually two questions there. First of all, yes, the number of times a file has changed is a really, really good measure, and it’s actually backed by empirical research: the number of times a module has changed is a better predictor than any other metric you can mine from the code. But still, that point is of course important, because there may be huge differences in commit styles between authors on a project. What I typically recommend is that you try to use a uniform commit style. If you cannot have that, there’s an alternative metric that you can use called code churn: instead of calculating the number of commits, you calculate the amount of code that has changed, and that completely removes the bias possibly introduced by commit styles.

And that’s how you extract your metrics?

Yes, it is. I use those two, and I tend to, again, stick with the number of commits if possible, because it’s such a simple metric and it’s so intuitive to reason about.
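
To make the two metrics from this answer concrete, here is a minimal sketch in Python. It is not the speaker’s tooling. It assumes the history was exported with something like `git log --numstat --pretty=format:--%h`, so each changed file appears as a tab-separated `added`, `deleted`, `path` line. The function name `change_metrics` and the sample log are hypothetical, and renamed files are not handled.

```python
from collections import Counter


def change_metrics(numstat_log: str):
    """Return (revisions, churn) per file from `git log --numstat` style text.

    revisions: number of commits that touched each file.
    churn: added plus deleted lines per file, summed over all commits.
    """
    revisions = Counter()
    churn = Counter()
    for line in numstat_log.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit header or blank line, not a file entry
        added, deleted, path = parts
        revisions[path] += 1
        # Binary files are reported as "-" for both counts; skip their churn.
        if added.isdigit() and deleted.isdigit():
            churn[path] += int(added) + int(deleted)
    return revisions, churn


# Hypothetical sample in the assumed format (three commits).
SAMPLE_LOG = """\
--a1b2c3
10\t2\tsrc/core.py
1\t1\tREADME.md

--d4e5f6
300\t120\tsrc/core.py

--0789ab
-\t-\tdocs/logo.png
3\t0\tsrc/core.py
"""

revisions, churn = change_metrics(SAMPLE_LOG)
for path, count in revisions.most_common():
    print(path, count, churn[path])
```

The commit count treats every commit the same regardless of size, which is why differing commit styles can bias it. Churn weights each change by the number of lines touched instead.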

And the last question here: can you recommend a good place in the town centre to go to party afterwards?

Yeah, I know a lot of really good pubs down in the south of Sweden, where I come from, but that won’t help you. So no, sorry. Okay, I delegate that to you.

Yeah, well, I’m not sure I could help you either. Thanks.