Thursday, September 6, 2012

Unix lover: snippet 1.

I love Unix so much... I had this LaTeX file in which I had written an \input line for each intended chapter of the main file. Then I realized that I still had to create all the chapter files, either typing every filename by hand or doing some crappy copy and paste.

However...

# Create an empty .tex file for each chapter listed in the main file:
# sed extracts the name from every \input{...} line and appends ".tex",
# then xargs hands the resulting filenames to touch.
cat <<EOF | sed -e 's/\\input{\(.*\)}/\1.tex/' | xargs touch
\input{intro}
\input{cha1}
\input{cha2}
\input{cha3}
\input{cha4}
\input{cha5}
\input{conclusions}
EOF
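
Each \input{...} line becomes a touch on the corresponding .tex file, so after the pipeline runs, intro.tex, cha1.tex and friends all exist as empty files, ready to be filled in.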

Wednesday, September 5, 2012

Models of human skills: gaussian, bimodal or power-law?

The first thing I have to say is that this post contains my own reflections: I have not found solid scientific evidence that proves these ideas right or wrong. If you have evidence in either direction, I'd be glad to see it.

One of my first contacts with probability distributions was probably around 7th-8th grade. My maths teacher was trying to explain to us the concept of the Gaussian distribution, and as an example he took the grade distribution. He claimed that most students get average results and that comparatively fewer students get better or worse grades (with the most "extreme" grades reserved for a very few).

I heard similar claims from many teachers of different subjects, and I was not surprised. It seems reasonable that most people are "average", with fewer exceptionally good or exceptionally bad individuals. The same applies to lots of other features (height, for example).


Many years later, I read a very interesting paper claiming that the distribution of programming results instead usually has two humps. For me, the paper was something of an epiphany. In fact, my own experience points in that direction (though here I may be biased, because I have not collected data). The paper itself comes with quite a lot of data and examples, though.

So apparently there are two models to explain the distribution of human skills: the bell and the camel. My intuition is that for very simple abilities the distribution is expected to be Gaussian. Few people have the motivation to become exceptionally good, but the thing is simple enough that few remain exceptionally bad either. On the other hand, sometimes there is simply some kind of learning barrier: a big step that not everybody is able to make. This is common for matters that are not entirely trivial.

In this case, the camel is easily explained. There are people who climbed the step and people who did not, and both groups are internally Gaussian-distributed. Another example could be an introductory calculus exam. If the concepts of limit and continuity are not clear, everything is just black magic; still, some people may get decent marks out of luck or some lower-level form of intuition. Among those who do have the fundamental concepts clear, there is the usual Gaussian distribution.
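
To make the camel concrete, here is a minimal sketch in Python (all the numbers are made up, not taken from any paper): grades drawn from a mixture of two Gaussians, one hump for those who climbed the step and one for those who did not.

import random

def camel_sample(p_step=0.5, lo=(45, 10), hi=(75, 8)):
    # With probability p_step the student climbed the learning step and
    # falls in the higher hump; otherwise the lower hump applies.
    mu, sigma = hi if random.random() < p_step else lo
    return min(100, max(0, random.gauss(mu, sigma)))

grades = [camel_sample() for _ in range(10000)]
# A histogram of `grades` shows the two humps.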

However, both models essentially assume that the maximum amount of skill is fixed. In fact, they are used to fit grades, which by definition have a maximum and a minimum value. Not every human skill has that property. Not every human skill is "graded" or examined in a school-like environment, and even for those that are, grades are usually coarse-grained. There is probably a big difference between a programming genius (like some FOSS superstars) and a regular excellent student. Or between, say, Feynman and an excellent student getting full marks. The same thing may be even more visible for skills usually perceived as artistic. Of course, it is hard to devise a metric that can actually rank such skills.

My idea is that for very hard skills the distribution is more like a power law (at least in the tail). Very few people are good, most people are not even comparable to them, and the probability of finding someone twice as good as an already very good candidate is only a small fraction.
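
A quick back-of-the-envelope check of that intuition (the parameters here are arbitrary): for a power law (Pareto), the probability of being at least twice as good, given that you are already at level x, is a constant fraction, while for a Gaussian it collapses as x grows.

from statistics import NormalDist

gauss = NormalDist(mu=0.0, sigma=1.0)
alpha = 2.5  # arbitrary Pareto exponent
for x in (1.0, 2.0, 3.0):
    # P(X > 2x) / P(X > x) under each model
    gauss_ratio = (1 - gauss.cdf(2 * x)) / (1 - gauss.cdf(x))
    pareto_ratio = 2.0 ** -alpha  # constant, independent of x
    print(x, gauss_ratio, pareto_ratio)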

Just to conclude: I believe that for very simple skills we usually get a Gaussian. If there are "learning steps" or some related, distinctly non-continuous feature, then we get multi-modal distributions (like the camel for programming: those who are able to build mental models of programs are successful, those who only use trial and error until the programs "compile" are not). Finally, for skills related to exceptionally hard (and challenging, and gratifying) tasks we get power-law distributions. Could we even use such distributions to classify the skills and the tasks? And in those cases, is a Gaussian distribution a good thing or not?

So, given people who are able to make mental models of programs, maybe Haskell programming remains multi-modal, because there are some conceptual steps that go beyond basic programming, while BASIC is truly Gaussian? Is OOP Gaussian? And FP?

And in OOP languages, what about static typing? Maybe dynamic typing is a camel (no, not because of Perl), because those who write tests and usually know what they are doing fall in the higher hump, while those who just mess around fall in the lower one. Ideas?

Sunday, August 26, 2012

Does Object-Oriented Programming really suck?

I recently read Armstrong's essay on how much OOP sucks. While such opinions would probably have been considered pure blasphemy a few years ago, nowadays they are becoming more popular. So, what's the matter with OOP?

The first thing that comes to mind is that OOP promised too much. As has already happened with other paradigms and ideas (e.g., Logic Programming and Artificial Intelligence), sometimes researchers promise too much and then they (or the industry) cannot stay true to their promises.

OOP promised better modularity, better code reuse [0], better "whatever". The problem is that bad developers write crap with every single programming paradigm out there. And by crap I really mean "utter crap". Take a good programmer using structured programming and a bad programmer using OOP: the good programmer's structured program would probably look "more object-oriented" than the bad programmer's OOP one.

Another problem occurred inside the OOP community: OOP "done right" was about message passing. Information hiding, encapsulation and the other buzzwords are a consequence of the approach (because objects communicate through messages, their internal state is opaque), not a goal of OOP. Objects are naturally "dynamic", and the way we reason about code is by separating concerns into objects holding related concerns and having the objects communicate in order to achieve the task at hand. And I'm a bit afraid of talking about "OOP done right": I really just have my own vision. OOP is very under-specified, so it becomes very hard to criticize (or defend) the approach.

However, OOP became "data + functions". To me, data is simply *not* part of OOP: it's an implementation detail of how objects maintain state. As a consequence, I do not really see data-driven applications as good candidates for OOP. Once again, OOP was sold as universal; it is not. Consider the successes that functional programming is achieving (performance- and concept-wise) in this very area.

This "data + functions" view comes from C++ being the first commercially successful OOP platform. The great advantages were that the transition from structured programming to OOP was much easier, and that programs could be far more efficient (read: "less runtime") than message-passing dynamic OOP systems, at least back in the day. However, so much of the original idea was missing!

Since back then classes were considered good (and C++ had, and still has, no separate concept of interface), and the language was statically typed, relatively verbose and without introspection, it became rather natural to focus on inheritance, which was later proved to be a bad strategy. Consequently, there is less experience in building OO software than one would think, considering that for a large part of its history OOP was dominated by sub-optimal practices.

So, what is really bad with OOP? Following Armstrong:

Objection 1 - Data structure and functions should not be bound together

Agreed! But I believe that data structures are not *truly* part of OOP. Let me explain.

Data is laid out in some way. There is a "physical layout", for example: you can express it very precisely in C with extensions, to the point of specifying the exact offsets of the various pieces of data. A physical layout also exists in high-level languages such as Python, but it is not part of the Python model. Of course, with the struct module or ctypes you can fiddle with it (but those are libraries); for the most part it stays outside the conceptual language used to describe the problem.

Data is also laid out in logical ways. Computer science has defined many data structures, and algorithms over them. You can use OOP to express such structures and algorithms; still, they are not "part" of OOP. They are just expressed in an object-oriented way (or maybe not, and they remain at the structured-programming level).

One very good practice in OOP is not mixing, in the same body of code, objects reasoning at different levels. So, for example, you have your low-level business logic expressed in terms of data structures, your high-level business logic expressed in terms of the low-level business code… and that's it. Functions are not really bundled with data. You don't have "data"; you have layers of objects.

The point is that OOP is about creating languages at the semantic level (you usually do not get to change the syntax). That's it. If the language is good, well… good. If the language is bad, OK, we've got a problem.

Is this suboptimal? In a way, yes. All the indirection may be very expensive. And since abstractions, well… abstract, you may find yourself with a language that is not expressive enough (at least not without additional performance costs). Still, functions and data structures are not bound together: you just have objects and messages, no "functions" and no "data". Please note: this may not be the right thing to do. But then the problem is not data + functions; it is just that OOP was applied to a specific domain ill-suited to the approach. OOP is not for every task in the world. The same objection applies, though, to every system built as a stack of layered abstractions.

Please also consider the Qc Na story! Objects are not necessarily opposed to functional programming (you can see them as closures that make different actions available). Objects are just about state + behavior + identity.
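
A minimal sketch of that view (the bank-account example is the traditional one; this particular code is mine): an "object" made of nothing but a closure, with state, behavior and identity, dispatching on messages.

def make_account(balance):
    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance
    def withdraw(amount):
        nonlocal balance
        balance -= amount
        return balance
    def dispatch(message, *args):
        # the "message passing": look the method up and invoke it
        return {"deposit": deposit, "withdraw": withdraw}[message](*args)
    return dispatch

account = make_account(100)
account("deposit", 50)   # -> 150
account("withdraw", 30)  # -> 120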

Objection 2 - Everything has to be an object

And this is something that I don't think is a problem. It is like saying that Scheme is bad because "everything is a list" (which, strictly speaking, is not true: there are atoms, lambdas and so on). If you do not want to program OO, then don't. If you do, you probably want everything to be an object.

The only issue with "everything is an object" is, sometimes, performance. From a variety of points of view, such a strategy may kill performance. Objects usually imply indirection (read: pointers), and a 64-bit pointer for every integer is bad. So we get hybrids like Java, and then we have to deal with boxing (manually in the past, automatically today). Performance issues can be somewhat "fixed" with proper JIT systems or other optimization techniques, or by providing libraries that do the right thing (see NumPy, for example).

Other than that, I prefer objections along the lines of "this thing that should be an object is something else" to those demanding that something that is an object should not be one. So I can look favorably on the objection that, in Python, "if" is not an object (true), even though I'm convinced that having if as a statement is not bad either.

And the whole objection about "time"… well, time (and dates) are a bitch to handle. But I find that Python's datetime module gets as close to perfection as humanly possible. I really can't see describing time with a bunch of enums plus some structures as an improvement. If anything, it looks to me like one of the cases where it is *easy* to see that the OO approach works better.

Consider the related problem of representing a date in a locale. If you introduce a new representation of time, you need either to modify the original function or to create a new one (maybe one that works with both representations). Defining an interface, on the other hand, makes it very easy to introduce new representations of time.
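
A duck-typed sketch of what I mean (the names here are invented for illustration): anything exposing a localized(locale) method counts as "a time", so new representations never require touching the report code.

import datetime

class IsoTimestamp:
    def __init__(self, text):
        self.text = text
    def localized(self, locale):
        return "%s (%s)" % (self.text, locale)

class UnixEpoch:
    def __init__(self, seconds):
        self.seconds = seconds
    def localized(self, locale):
        dt = datetime.datetime.utcfromtimestamp(self.seconds)
        return "%s (%s)" % (dt.strftime("%Y-%m-%d %H:%M"), locale)

def print_report(when, locale="en_US"):
    # works with any representation of time, old or new
    print(when.localized(locale))

print_report(IsoTimestamp("2012-08-26T10:00"))
print_report(UnixEpoch(1345975200), locale="it_IT")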

Objection 3 - In an OOPL data type definitions are spread out all over the place


And once again, in OOPLs data type definitions are not spread out all over the place. They are in C++, and perhaps in Java. And even in Java, if things are done properly, you reason in terms of interfaces, that is, in terms of typed messages, not in terms of data structures.

"In an OOPL I have to choose some base object in which I will define the ubiquitous data structure, all other objects that want to use this data structure must inherit this object. Suppose now I want to create some "time" object, where does this belong and in which object…"

Here it is pretty clear what he has in mind. I think it is clear that, conceptually, time is an interface, and that there is really no need for it to define a "data structure". And if your language is duck-typed, the interface is implicit and you are done before you even started.

Objection 4 - Objects have private state


And here I have to agree: state is the root of all evil. The problem is that entirely stateless systems are simply impractical. We reason in terms of state. We can describe lots of things as stateless, but somewhere we have state. We have files. We have documents. We want them saved and retrieved.

So we have to deal with state. Agreed. But:

1. imperative programming is about state;
2. object-oriented programming, strictly speaking, is not.

You can use a "functional" style in OOP, for example. Most of the time you don't, because the code becomes harder to write; but in many situations, on the contrary, you do it because it makes the code easier. I often write immutable objects that cannot be modified: they are no more stateful than a parameter of a function.
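
For instance, here is a minimal sketch of such an immutable object (Money is just an invented example): all state is fixed at construction, and every "modification" returns a new instance.

class Money:
    __slots__ = ("_amount", "_currency")
    def __init__(self, amount, currency):
        # bypass our own __setattr__ exactly once, at construction time
        object.__setattr__(self, "_amount", amount)
        object.__setattr__(self, "_currency", currency)
    def __setattr__(self, name, value):
        raise AttributeError("Money instances are immutable")
    def add(self, other):
        assert other._currency == self._currency
        return Money(self._amount + other._amount, self._currency)

price = Money(10, "EUR").add(Money(5, "EUR"))  # a new object, nothing mutated
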
Sometimes this is not practical. OK. And surely a very stateful style of programming is bad. State in OOP is expressed as behavior; that's it. Sometimes it is done properly, sometimes it is not. Some strategies for dealing with state are extremely nice (STM, for example); others are not. Also consider how "modern" functional languages merge concepts from OOP (without being OOP) and how "modern" OOP languages merge concepts from FP (without being functional).

Examples: Python from the OOP side, with generators, lazy evaluation, list comprehensions, closures, etc.; Clojure from the FP side, with all the features it has to express what resembles interfaces, STM and so on…

——

[0] someone once told me they feel lucky when they can actually use the code effectively once, let alone reusing it.

Sunday, August 19, 2012

Java 7 for Mac OS X

And it was about time...

Download Page

Luckily, from now on Java for OS X should not lag too much behind.

Saturday, July 28, 2012

Orwell Dev-C++: Dev-C++ 5.2.0.3 Released

Not a big fan of Windows C++ IDEs... well, I'm not a big fan of Windows, I'm not a big fan of IDEs, and I don't particularly like C++ either. As far as I'm concerned, if you really need a C++ IDE for Windows you should have started using Code::Blocks or Eclipse already. However, if you really want to stick with Dev-C++, at least grab this new version.

The importance of using a modern C++ compiler should not be overlooked.

Orwell Dev-C++: Dev-C++ 5.2.0.3 Released: Time for another pile of bug fixes. I've also added a few features, like an updated set of built in compiler options and full file path hint...

Monday, July 9, 2012

On language level

When historians started to name periods of time, they roughly divided human history into three main periods: Ancient History, the Middle Ages and Modern History. Although there is no general consensus on the actual subdivision (I remember something about Burdach and Burckhardt), that is the general idea.

Modern History essentially starts with the 15th century (again, this is debated) and goes on. Then there is the idea of contemporary history, which is a moving target covering roughly the last 80 years. I somewhat believe that we may need a better name for what happened in the 20th century, one that will still work after it is no longer contemporary. Still, that is not my job.

Something similar occurred in Art History, where Modern Art goes from about 1860 to 1970. Then we have Contemporary (or Postmodern) art. So in a few years a new name will be needed... like Post^2Modern. Don't care.

And with programming languages we basically have the same trouble. In 1960 Fortran was a high-level language, so every non-machine-level language came to be called a high-level language, and we call some languages very high level. In a few years, will we have "overwhelmingly high-level languages"?

Monday, May 21, 2012

Worst bug ever: random and concurrent software, but the bug was somewhere else!

Today I finally pinpointed the worst bug I have ever met. The bug manifested perhaps a week ago, when I decided to plot the distribution of a given attribute in a simulator. Essentially, the distribution should have been a power law, but a different behavior occurred. Varying some parameters changed the shape of the curve, but not the fact that it was not a power law.

[plot: ccdf.png, CCDF of the attribute distribution, expected vs. measured]

Here, in red, is the "expected" curve, while the other three colors are the distributions I actually found (blue the original one, green and cyan variants with different parameters). The essential problem is that finding the exact cause of this was hell. This kind of bug is typically not found with unit testing: the very problem is randomized, and even mocking the random number generator does not really help, because each individual event looked "right" and only the process as a whole showed the wrong behavior.

Since the simulation is concurrent, I spent the first days checking whether the actual order of execution somehow caused the problem (the red curve is obtained with a non-concurrent model). It turns out I did things right: event order did not influence the problem.

Eventually, I reviewed the drawing code (in the sense of urns, not in the sense of painting!). After a couple of days I figured out that perhaps I should have mathematically proved that the code implementing the draw produced the same distribution as the "red" variant. My wrong assumption was that sampling a graph edge and taking one endpoint is equivalent to an extraction from an urn where each node is placed n times, where n is its degree. The catch is that even though the graph is undirected, the computer stores each edge's endpoints in a definite order.
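
A toy reconstruction of the mistake (illustrative code only, not the simulator's):

import random

# Toy undirected graph, stored as the computer stores it: ordered endpoints.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]

def draw_biased():
    # What I was doing: sample an edge, always keep the first endpoint.
    return random.choice(edges)[0]

def draw_degree_proportional():
    # The urn model: sample an edge, then randomize between the endpoints,
    # so each node comes up with probability degree / (2 * len(edges)).
    return random.choice(random.choice(edges))

Here draw_biased returns "a" three times out of four and never returns "d", while draw_degree_proportional matches the urn with one ball per edge endpoint.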

This is the distribution obtained by always taking the other endpoint:


[plot: ccdf.png, CCDF when always taking the other endpoint]

And here, finally, is what I got when randomizing between the two endpoints:

[plot: ccdf.png, CCDF with the endpoint randomized]

Now I just have to rewrite the code implementing the drawing. And I also have to make it fast (my former procedure was quite fast, given that the whole graph is distributed).