Gaming the IDE


So people have a lot of ideas about how software should be written, how code should look, what should be tested, and just about everything else.

We also know we can measure most of these things: cyclomatic complexity, code coverage, method length, methods per interface, and just about anything else. Since we are talking about code, the tooling to do the measuring already exists; FxCop and many source code visualization tools are out there.

These tools are sometimes also run as part of automated builds.

But they don’t yet seem to make it into the IDE at the point of check-in.

What would be the effect if a developer had a small message that popped up upon check-in and gave a score for the changes they were about to commit? What would happen if there was a leaderboard? How would a code base change, and how fast would it happen?
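To make the idea a bit more concrete, here is a minimal sketch of what such a check-in score could look like. Everything in it is an assumption for illustration: the `MetricDelta` type, the metric names, and especially the weights are made up, and a real tool would have to pull the before and after numbers from whatever analyzer it uses.

```python
from dataclasses import dataclass


@dataclass
class MetricDelta:
    """Change in one metric caused by a commit (after minus before)."""
    name: str
    delta: float
    weight: float  # points per unit of change; negative when lower is better


def checkin_score(deltas):
    """Sum the weighted metric changes into a single score for the commit."""
    return sum(d.delta * d.weight for d in deltas)


def checkin_message(author, deltas):
    """Build the small message a developer would see at check-in."""
    score = checkin_score(deltas)
    verdict = "nice work" if score >= 0 else "room to improve"
    return f"{author}: {score:+.1f} points ({verdict})"


# Hypothetical numbers for one commit; the weights are invented.
example = [
    MetricDelta("cyclomatic complexity", delta=-3, weight=-2.0),  # complexity fell by 3
    MetricDelta("code coverage (%)", delta=1.5, weight=4.0),      # coverage rose by 1.5
    MetricDelta("average method length", delta=2, weight=-0.5),   # methods grew by 2 lines
]

print(checkin_message("alice", example))
```

The weights are where the policy lives: whatever carries the most points is what people will optimize for, whether or not it is what you actually wanted.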

There are some important questions to answer before you turn this on in a company. First and foremost, make sure that you reward things you actually want to see. And lastly, never underestimate the ability of people to game a system: they will find the loophole, they will crawl through it, and they will likely get upset if you tell them that what they did was against the spirit of the game.

So maybe this isn’t a good idea, but it is still interesting to muse on the concept.

Posted in Software Engineering

The Builder analogy is right


So there has been some critique of the builder analogy.

And a lot of it is deserved.

But one place where it is right, and not often referenced, is how the builder comparison applies to current builders.

Current builders are not really what software likes to compare itself to; the software profession prefers a more romanticized view of builders. The comparison tends to be made with builders from the Middle Ages, when there were craftsmen, guilds and architects, and cathedrals were being built. It is not made with current builders, who put up apartment buildings in under 6 months using the cheapest labor around, with a result that looks very similar to a building 3 blocks over.

So let’s look at current builders

  • Chances are that everything is subcontracted, from the HVAC to the architecture to the cabinets and floors
  • Chances are that the builder, and almost everyone else involved in the project, is not an employee of the final occupant
  • Chances are that the building is constructed from many prefabricated components
  • Chances are that the building is created from a relatively standard architecture

So let’s compare that to current software construction

  • Chances are that everything is done in house
  • Chances are that the developers are employees of the builder
  • Chances are that the project is constructed from many custom made components
  • Chances are that the project is created from a custom, homemade architecture

Why did builders switch from this craftsmen mentality to a subcontracting mentality? Not only that, why did builders make this change when it is much harder for construction to make this switch than for software? First of all, buying subcomponents still requires raw resources for each unit bought; no matter how many cabinet doors you buy, the cost can never fall below the cost of the wood it takes to make the door. For developers, on the other hand, the cost of replicating subcomponents can approach 0 when volume is high enough. The cost of replicating an architecture is already near 0, as most can be found on the internet at no cost.

An interesting side note is that one part of the industry does seem to follow the model of current builders. And surprisingly, it is the game industry. Just look at a current project and see all the different studios that do small pieces of a big project on behalf of a publisher. One studio does level design, another renders hair and yet another provides a physics engine. And the funny thing is, we think it’s normal.

Posted in Software Engineering

On Craftsmen


People have always struggled with the most interesting problem of what developers are.

One of the first attempts I know of is The Mythical Man-Month, which communicated the concept that some developers have 10 times the output of other developers. Ten times the output is completely unheard of in manufacturing, and really in most of the engineering fields. This spawned the idea that some developers must be artists, since that is an area where the ten-to-one ratio makes great sense. There are, after all, many artists, and we only hear of a few; there are the greats and the rest.

The artist concept is interesting but completely unmanageable for an industry.

  • How can you price a developer who is ten times the value of an average developer?
  • What would you do when the developers catch on?
  • How can you replace these types of developers?

Industry is there to make things, so the search was on for a way to make developers.

So we move on to the Pragmatic Programmer, and the introduction of the craftsman. Craftsmen seem much more manageable, because craftsmen can be trained. The concept is that there is a core body of knowledge and any person who masters it becomes a developer. Sure, talent is in there somewhere, but the important thing is the technologies. Software engineering was born, organizations sprang up, acronyms were created, levels were defined and a bright new path was set that allowed organizations to tame their software problems.

That seems to be where we are these days: developers are craftsmen who care about honing their trade to perfection. If you want better results, adopt these technologies and it will happen; and if it fails, then you probably didn’t implement enough of them or did it wrong. Go to a presentation on Agile, Scrum, or some other methodology and they will all tell you that the only people they knew who abandoned the technology were those who didn’t implement it right. The good news is that the guy presenting is probably a consultant who can help you implement it right.

And this idea is convenient for the developers as well: it gives them this nostalgic concept of being the creators, part of a profession in the making. There are organizations springing up to certify developers. The rules are just being created and developers are eager for certifications and titles, for all who aspire want to one day be called the architect.

It seems that the craftsmen metaphor is working, and the development industry is rapidly shaping itself to fit its new mold.

There are some gaps, though. For starters, engineering disciplines are very rarely in a position where measurements are a massive problem. How do you measure the output of a developer? How do you predict the output of an architecture? How do you quantify the difference between two architectures? Sure, we can call this part of being a new discipline, but it goes a little bit deeper. And that is weird, because everything we create is already in a format that is friendly for measurements; it is even discrete.

But measuring fails. We can measure lines of code, but all else being equal we prefer less to more. Less is easier to comprehend, easier to maintain, likely faster to write, likely less buggy and in the end cheaper. That is why we use higher level languages, abstractions, code generation and most other popular techniques. The same goes for features: more buttons do not make an application more valuable.

Another issue comes from the inherent problem of the craftsmen concept. It goes as follows. Envision a craftsman. What is he/she doing?

They are probably making something.

And that is the problem. If you ask a craftsman for something their response is to make something. The question rarely matters much as the craftsmen focus on it as an opportunity to make something. And quite often this creates the problem. No one is complaining that developers are not creating things, but they are complaining that they are creating the wrong thing.

It is the beautiful monster that was created when the developers all got convinced that they were craftsmen and started to act accordingly. They are creators, and the creation is done because that is what defines them. They learn how to create more, better, faster and sometimes closer to the right thing. After all, if it was wrong it was probably bad requirements.

Requirements are funny; they are like an insurance policy against being wrong. We have some poor bastards write them out beforehand and then the developers have a shield to hide behind. Sure, some methodologies are trying to get the requirements out of the picture, or at least make them more flexible. But this is still done under the guise of being craftsmen and creating a more perfect creation. And that is why it doesn’t work all that well.

As long as developers are considered craftsmen the most important element of software development will stay in the background.

The customer.

That is because craftsmen and customers don’t traditionally meet. It is the sales people that are the bridge. The craftsmen create, and the sales people find those whose needs the product meets.

In software this is not really as easy: for consumer software there is a lot more competition, and for industry software there is often only one customer. So for success to be possible the product has to suit the customers quite closely, and then the sales people can do the rest.

To bridge this gap there is the introduction of the BA (business analyst), and those would be the poor bastards that get to play the telephone game between the developers and the customers in a desperate attempt to make the product suit the customer. They get to be the glue between customers and development. And as long as they are there, developers won’t have to take ownership in front of a customer.

But let’s imagine what would happen if tomorrow all BAs got fired (no, I don’t hate all BAs).

The only way that companies would be successful is if the developers started talking to the customers and became responsive to the customers’ needs.

The only way that organizations will survive is for developers to become service sector workers.

Oh, and when we assume that developers are service sector workers, measuring becomes pretty easy. I’m pretty sure everyone has filled out some sort of customer satisfaction survey before, and that seems to be working pretty well for the industry.

Posted in Possum Labs, Software Engineering

On Data


I’m not a DBA, and I’m also not a data architect. But I spend a lot of my time in databases, so this is written from a developer’s perspective.

The three stages of data

It seems that data lives in three stages: first the data is raw, then it gets normalized, and finally it gets aggregated.

Raw data tends to be de-normalized and segregated by source, and gets updated. The source of this data could be a data feed, a service, user activity on the site, or sensors. This data tends to make up the majority of the records in databases that you’ll find in applications. And in many cases one of the data sources will be the system you are implementing.

The second stage of data, the Normalized Data, is where the different data sources come together and the data gets normalized. This is where records from different sources get linked together by business rules and where mappings occur between the enumerations of different sources.

The third stage of the data, the Summarized Data, is where data gets summarized and molded for consumption. An example of this would be when title, first name, last name get combined into a display name. This is where the data for an account is summarized or content ratings get computed.
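The three stages can be sketched with an in-memory SQLite database. This is only an illustration under assumptions of my own: the two feeds, all table and column names, and the status mappings are invented, and a real system would have far more business rules in the normalization step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stage 1: raw data, one table per source, kept in the shape it arrived in.
    CREATE TABLE raw_feed_a (ext_id TEXT, title TEXT, first_name TEXT, last_name TEXT, status TEXT);
    CREATE TABLE raw_feed_b (customer_no TEXT, full_title TEXT, given TEXT, family TEXT, state INTEGER);

    INSERT INTO raw_feed_a VALUES ('A-1', 'Dr.', 'Ada', 'Lovelace', 'ACTIVE');
    INSERT INTO raw_feed_b VALUES ('B-7', 'Mr.', 'Alan', 'Turing', 0);

    -- Stage 2: normalized data, one shape for every source.
    CREATE TABLE person (source TEXT, source_id TEXT, title TEXT,
                         first_name TEXT, last_name TEXT, is_active INTEGER);

    -- Stage 3: summarized data, shaped for consumption.
    CREATE TABLE person_summary (source TEXT, source_id TEXT, display_name TEXT);
""")


def normalize(conn):
    """Rebuild the normalized table, mapping each source's enumeration to one flag."""
    conn.execute("DELETE FROM person")
    conn.execute("""
        INSERT INTO person
        SELECT 'feed_a', ext_id, title, first_name, last_name,
               CASE status WHEN 'ACTIVE' THEN 1 ELSE 0 END
        FROM raw_feed_a
    """)
    conn.execute("""
        INSERT INTO person
        SELECT 'feed_b', customer_no, full_title, given, family,
               CASE state WHEN 1 THEN 1 ELSE 0 END
        FROM raw_feed_b
    """)


def summarize(conn):
    """Rebuild the summary table; the display name is computed once, here."""
    conn.execute("DELETE FROM person_summary")
    conn.execute("""
        INSERT INTO person_summary
        SELECT source, source_id, title || ' ' || first_name || ' ' || last_name
        FROM person
    """)


normalize(conn)
summarize(conn)

# Retrieval is a plain select against the summary table.
rows = conn.execute(
    "SELECT display_name FROM person_summary ORDER BY source"
).fetchall()
print(rows)
```

Each stage only reads from the stage before it, so the application never has to know which feed a record came from.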

There once was a database in a company far, far away ...

Most systems will have examples of all of these concepts, but the concepts are commingled in the database. A common pattern tends to be that people take their first data feed and make that the database. The application retrieves the data through stored procedures that aggregate the data.

Then a second data source comes along.

The second data source will offer some data that the first data source doesn’t offer, so the data model gets augmented to support the second source. A few columns get added to the existing model to support the new data and the new data gets merged in with the original data. The procedures get updated to account for the changes in the model, and things get up and running.

So far so good, or at least so it seems. But something was lost: the system is now in a transient state. Depending on the order in which data arrives from the sources, the behavior will be different. The system also has an all-or-nothing approach: when a data source is added, the entire system needs to be aware of it. The new data source will impact the procedures that retrieve data for the UI as well as the data load for the initial source.

So let’s look a bit further into the life cycle of the application. More sources are added, more data is added, there are more users and they are complaining that the system is getting “slow”. Load drives the next generation of changes: the summarization of data. Some complex common queries get precomputed, sometimes in their own tables, sometimes as columns added to existing tables. The high usage areas of the application get updated to use the new queries.

And this brings us to the point where most applications end up: with a database in a transient state and production support DBAs getting midnight calls as the order of arrival of the data sources manages to put the system in a new and interesting state.

Conclusion

Most systems will fall in a range between a segregated approach and an integrated approach. And of the systems that I’ve worked on the more segregated systems were the ones that were easiest to work on. Some of the benefits that resulted from this segregation were as follows.

  • All normalized data and aggregated data could be deleted and recomputed. When the business logic for aggregation or normalization changed, it only had to change in one place and there was no “legacy” data.
  • There was less code, although much more data. If any information had to be computed, it was precomputed and stored in a table. The procedures to retrieve data were simple joins and no data manipulation was needed. In the transition from a classic system to a segregated system we reduced the lines of SQL by more than 60%.
  • The system was fast; everything was a simple select with a few possible joins.
  • The system was easy to learn and fairly self-documenting. Each summary table had procedure(s) that populated it, and the data retrieval procedures built upon the summary tables.
  • The system was easy to verify; it could be inspected in each of the stages, so you only needed to verify small pieces of logic at a time.
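The first benefit, deleting and recomputing derived data when a rule changes, can be sketched as follows. Again this is a toy under stated assumptions: the `person` and `person_summary` tables and the two display name rules are hypothetical, and the rule is passed in as a trusted SQL fragment written by the developer, never as user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, title TEXT, first_name TEXT, last_name TEXT)"
)
conn.execute("CREATE TABLE person_summary (id INTEGER PRIMARY KEY, display_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    [(1, "Dr.", "Ada", "Lovelace"), (2, "Mr.", "Alan", "Turing")],
)


def rebuild_summary(conn, display_name_sql):
    """Throw away the summary rows and recompute all of them from the normalized table."""
    with conn:  # commit the delete and the insert together, or roll both back
        conn.execute("DELETE FROM person_summary")
        # display_name_sql is a trusted, developer-written expression (the business rule).
        conn.execute(
            f"INSERT INTO person_summary SELECT id, {display_name_sql} FROM person"
        )


def display_names(conn):
    """Retrieval stays a plain select, with no manipulation at read time."""
    return [
        row[0]
        for row in conn.execute("SELECT display_name FROM person_summary ORDER BY id")
    ]


# Original rule: title, first name, last name.
rebuild_summary(conn, "title || ' ' || first_name || ' ' || last_name")
before = display_names(conn)

# The rule changes: one expression changes, and every row is recomputed.
rebuild_summary(conn, "last_name || ', ' || first_name")
after = display_names(conn)

print(before)
print(after)
```

Because nothing in the summary table survives a rebuild, no row is left behind that was computed under the old rule.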

As for the drawbacks

  • This system will increase data size by a factor of 2 to 5; table lengths are not impacted. All data is copied once to the normalization tables, and possibly to the summary tables.
  • The system does not accommodate systems with frequent updates. Each update cascades to the normalized and summary tables. It is ideal for systems that have a concept such as batch or nightly imports.

Considering the reduction in the cost of redundant storage, and the advent of big table solutions like those seen in many cloud solutions, the major deficit of the solution (storage) will be negated. The secondary deficit is still applicable.

Posted in Architecture