
Towards a theory of software development expertise


Towards a theory of software development expertise Baltes et al., ESEC/FSE’18

This is the last paper we’ll be looking at this year, so I’ve chosen something a little more reflective to leave you with (The Morning Paper will return on Monday 7th January, 2019). The question Baltes and Diehl tackle is this: “How do you get better as a software developer?” What does expert performance look like?

We present a first conceptual theory of software development expertise that is grounded in data from a mixed-methods survey with 335 software developers and in literature on expertise and expert performance…. [the theory] describes central properties of software development expertise and important factors influencing its formation.

In essence, ask a bunch of practitioners what they think, use a disciplined coding scheme to interpret the answers (a “grounded theory”), and then layer in what we know about expertise and expert performance in general. The end result is a “conceptual theory” that shows the various contributors to expert performance and the relationships between them. “Software Development” in the current work is synonymous with “programming.”

To make the paper come alive you need to engage with it a little: Does the theory developed by the authors make sense to you? What’s missing? How would you weight the various factors? How could you apply this on a personal level in 2019? How could this be applied in your team or organisation to raise the collective level of expertise next year?

Software developers can use our results to see which properties are distinctive for experts in their field, and which behaviors may lead to becoming a better software developer…. Employers can learn what typical reasons for demotivation among their employees are, and how they can build a work environment supporting the self-improvement of their staff.

A grounded theory

The first phase involved sending a questionnaire to 1,000 users who had been active on both GitHub and StackOverflow between January 2014 and October 2015; 122 responses were received.


The grounded theory (GT) coding exercise was then used to generate a theory from the qualitative data:

… the process of coding assigns “summative, salient, essence-capturing” words or phrases to portions of the unstructured data. Those codes are iteratively and continuously compared, aggregated, and structured into higher levels of abstractions, the categories and the concepts. This iterative process is called constant comparison.

(Aside: it strikes me that the body of work on grounded theory development might be very interesting to study from the perspective of domain-driven design and the building of a ubiquitous language.)

After much distillation, the model comes out as follows:

The grounded theory describes software development expertise as a combination of a certain quantity and quality of knowledge and experience, both general and for a particular language. The work context, behavior, character traits, and skills influence the formation of expertise, which can be observed when experts write well-structured, readable, and maintainable source code.

You’ll know an expert programmer by the quality of the code that they write. Experts have good communication skills, both sharing their own knowledge and soliciting input from others. They are self-aware, understanding the kinds of mistakes they can make, and reflective. They are also fast (but not at the expense of quality).

Experience should be measured not just on its quantity (i.e., number of years in the role), but on its quality. For example, working on a variety of different code bases, shipping significant amounts of code to production, and working on shared code bases. The knowledge of an expert is T-shaped with depth in the programming language and domain at hand, and a broad knowledge of algorithms, data structures, and programming paradigms.

A preliminary conceptual theory

The next phase was to take the grounded theory and embed it within the existing literature on expertise and expert performance, for which the main resource used was ‘The Cambridge Handbook of Expertise and Expert Performance’.

This handbook is the first, and to the best of our knowledge most comprehensive, book summarizing scientific knowledge on expertise and expert performance.

The result of this process is a preliminary conceptual theory, summarised as follows:

Acquiring expertise is not exclusively a cognitive matter; personality and motivation influence behaviours that may or may not lead to improvements in expertise. The work context, including team members, managers, and customers, can also influence the behaviour of a developer, and this influence can vary according to the type of task being undertaken.

Reaching true expert levels requires deliberate practice combined with monitoring, feedback, and self-reflection.

Deliberate practice

Having more experience with a task does not automatically lead to better performance. Research has shown that once an acceptable level of performance has been attained, additional “common” experience has only a negligible effect; in many domains performance even decreases over time. The length of experience has been found to be only a weak correlate of job performance after the first two years.

Deliberate practice is required to become an expert: prolonged efforts to improve performance while continuously increasing the difficulty and centrality of development tasks.

…studies have shown that deliberate practice is necessary but not sufficient to achieve high levels of expert performance— individual differences also play an important role.

Monitoring, feedback, and self-reflection

Deliberate practice requires a way of monitoring performance, which could come from a teacher, coach, mentor, or peer: “the more channels of accurate and helpful feedback we have access to, the better we are likely to perform.” Monitoring and self-reflection also influence motivation, and consequently behaviour.

The full conceptual theory

For the third and final phase the authors sampled two additional programmer populations (active Java developers and very experienced developers), with the goal of further elaborating and refining the categories and relationships in the theory.

The final resulting model looks like this:

[Figure: the full conceptual theory of software development expertise]

The most frequently cited tasks that an expert should be good at were designing software architecture, writing source code, and analysing and understanding requirements. Within the software architecture task, understanding modularisation and decomposition were frequently mentioned.

In terms of personality traits, experts should be open-minded and curious, be team players, and pay attention to detail. Patience and self-reflection were also cited. In terms of general skills, “problem solving” came top of the list, under which analytical thinking, logical thinking, and abstraction/decomposition all feature. Another important skill is being able to assess trade-offs.

Mentors should be guiding, patient, and open-minded. Participants were most motivated by mentors that posed challenging tasks.

To facilitate continuous development of their employees’ software development skills, (employees suggested that) employers should:

  1. Encourage learning (e.g. training courses, conference attendance, and access to a good analog or digital library)
  2. Encourage experimentation (e.g. through side projects and by building a work environment that is open to new ideas and technologies)
  3. Improve information exchange between development teams, departments, and even companies (e.g. lunch-and-learn sessions, rotation between teams, pairing, mentoring, and code reviews)
  4. Grant freedom (primarily in the form of less time pressure) to allow developers to invest in learning new technologies or skills.

In contrast, non-challenging or routine tasks result in demotivation. Other causes of performance decline over time are lack of a clear vision or direction, absence of reward for quality work, stress in the work environment, and bad management or team structure.

Your turn

How will you ensure that in 2019 you grow your expertise, and not simply add another year of (the same or similar) ‘experience’?

See you in January! Thanks, Adrian.




mcmansionhell: justice babey, hell yeah (h/t JacobDisagrees on Twitter)



The Terrifying Moment at the Congressional Google Hearing Today



During a radio interview a few minutes ago, I was asked for my opinion regarding Google CEO Sundar Pichai’s hearing at Congress today. 

There’s a lot that can be said about this hearing. Sundar confirmed that Google does not plan to go ahead with a Chinese government-censored search engine — right now.

Most of the hearing involved the ridiculous, continuing false charges that Google’s search results are politically biased — they’re not.

But relating to that second topic, I heard one of the scariest demands ever uttered by a member of the U.S. Congress.

Rep. Steve King (R-Iowa) wants Google to hand over to Congress the identities of the Googlers whose work relates to search algorithms. King made it clear that he wants to examine these private individuals’ personal social media postings, his direct implication being that showing a political orientation in your personal postings would mean that you’d be incapable of doing your work on search in an unbiased manner.

This is worse than wrong, worse than stupid, worse than lunacy — it’s outright dangerous McCarthyism of the first order.

Everything else that occurred in that hearing pales into insignificance compared with King’s statement.

King continued by threatening Google with various punitive actions if Google refuses to agree to his demand regarding Google employees, and also to turn over the details of how the Google search algorithms are designed — which of course Congress would leak — setting the stage for search to be gamed and ruined by every tech-savvy wacko and crook.

Steve King has a long history of crazy, racist remarks, so it’s no surprise that he rants into straitjacket territory when it comes to Google as well.

But his remarks today regarding Google were absolutely chilling, and they need to be widely and vigorously condemned in no uncertain terms.

–Lauren–


Github Organization as a Code (with Terraform)


Terraform provides an easy way to define, organize, and version all kinds of resources and permissions for GitHub, as well as to recreate an organization’s structure from scratch at any time.

https://medium.com/@yurinnick/github-organization-as-a-code-29da7efe3086
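
As a taste, here is a minimal sketch of the idea using the Terraform GitHub provider (the organization, repository, and team names below are hypothetical, and provider argument names have changed across provider versions):

    # Hypothetical example: one repository, one team, one permission grant,
    # all declared as versionable Terraform resources.
    provider "github" {
      organization = "example-org"
    }

    resource "github_repository" "docs" {
      name        = "docs"
      description = "Team documentation"
      private     = true
    }

    resource "github_team" "developers" {
      name    = "developers"
      privacy = "closed"
    }

    # Grant the team push access to the repository.
    resource "github_team_repository" "developers_docs" {
      team_id    = github_team.developers.id
      repository = github_repository.docs.name
      permission = "push"
    }

Running terraform apply converges the organization to this declared state; recreating it from scratch is just a destroy followed by another apply.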

submitted by /u/yurinnick

CPS May Both Over- and Underprotect

Walter Olson

Before entering onto my disagreements with Prof. Dwyer, here are a few items on which he and I do agree in part or full. I think he makes a good point that if courts or lawmakers restrict agencies’ use of soft or intermediate sanctions such as safety plans, they will often turn to harder methods. Moreover, a push to fit soft or intermediate sanctions into a more legalistic framework, while generating some results I might applaud – such as a better audit trail by which we could check agencies’ use of the sanctions – might also have other less welcome effects, such as cost, delay, and hazards to privacy as more family details get inscribed in permanent public records. Finally, I believe I agree with both Redleaf and Dwyer that caseworkers are placed under conflicting and difficult demands, while being asked to exercise judgment for which training is (and maybe always will be) inadequate. No matter how much we may fear the power of CPS agencies, demonizing caseworkers as a group is not the right answer.

Dwyer thinks it significant enough to make the point twice that “the rate of actual child maltreatment greatly exceeds the rate at which children are reported to CPS as maltreated.” I think this point is not worth making even once.

A quarter century ago, there was a famous study on medical negligence in which experts reviewed a large random sample of treatment records to determine how often substandard care had harmed patients. They found a lot of bad, injurious care – in fact many more instances of it than there were lawsuits. In other words, most times that doctors committed negligent harm, they were not sued. As you can imagine, trial lawyers crowed about that part of the study. But they did not crow about the other part, in which the check was done in the reverse direction: looking at the cases where doctors were sued, the reviewers in the great majority of instances did not find substantiation of negligent harm. (Many of the unsubstantiated claims nonetheless obtained settlements.) The combined findings of error in both directions do not somehow balance out to show that the malpractice-suit system was working well as a whole. Quite the reverse.

Dwyer urges us to draw inferences to parents’ disfavor from missing evidence. “A caseworker’s conclusion that a report is unfounded,” he writes, “does not amount to a determination that the report was invalid or false.” Maybe so, but how much less does it entitle us to proceed as if the parent was probably guilty but contrived to get away with it this time? As Diane Redleaf rightly recognizes, any legal system deserving the name of justice must distinguish sharply between accusation and proof, most especially when the subject matter of accusations is read by society as a matter of deep disgrace and moral stain, criminal liability or no. To me, Dwyer’s inquiry into the details of the Hernandez case shows how easy it is for family members to set off tripwires of suspicion under questioning. Inconsistencies and errors were found in parental accounts on such matters as who was present at a scene and whether there were objects in a crib. It’s not exactly news that witnesses after an incident often give confused and contradictory accounts. If we are going to entrust caseworkers with powers of on-the-spot family separation based on subjective reception of demeanor evidence, a spider sense of something just not seeming right – at least when an “unexplained bruise” figures into the mix – then I hope at least we are duly awed and humbled at the formidable nature of the discretion we are entrusting to caseworkers over human lives.

As I mentioned in my earlier comment, there are agencies willing, as policy, to snatch children from parents over marijuana use in the home, over letting Junior sit in the back seat while Mom picks up the dry cleaning, over playing alone in the park at age 8, and over a host of other infractions within past or present normal range. Ten years from now, maybe the triggers will be cigarette smoking in kids’ presence, moderate drinking during pregnancy, or a snack-food-based diet. Being popped into the care of paid strangers through multiple and shifting placements may involve getting yanked into a different school system, losing touch with your old friends, and crying yourself to sleep each night from missing your real family – but never mind, agencies record a low rate of formal abuse findings in situations like yours. Above all when shifting policy and value judgments get framed in the language of claims to expertise, families fear CPS, and they are right to fear CPS.

I will note for the record Prof. Dwyer’s at best puzzling statement that government “creates legal relationships between children and persons who wish to serve as parents, whether that occurs in an adoption proceeding or via biology-based parentage law.” Pending adoptions are one thing, but I would have called actual biological mothers and fathers, as well as parents after completed adoptions, not “persons who wish to serve as parents,” but simply “parents.” To claim that by not intervening to destroy an established family the state is “continually, albeit implicitly, reaffirming its choice of legal parents” is to imply an astounding subservience of the family to the state. And to state that when it detects “a substantial divergence” between parents’ and children’s interests, the state “should act to protect the child, period,” is as wrong as can be, too. A parent’s unwise choice of a partner in remarrying after divorce, for example, quite clearly will often conflict with the child’s interests. That does not mean the state should have the slightest say in the matter.

In perspective, my differences with Diane Redleaf’s views seem small. And I do wonder, with her, why it is that parents facing the seizure of their children do not already have an established right to be informed promptly of the nature of the charges against them.


Unikernels as processes


Unikernels as processes Williams et al., SoCC’18

Ah, unikernels. Small size, fast booting, tiny attack surface, resource efficient, hard to deploy on existing cloud platforms, and undebuggable in production. There’s no shortage of strong claims on both sides of the fence.


In today’s paper choice, Williams et al. give us an intriguing new option in the design space: running unikernels as processes. Yes, that’s initially hard to get your head around! It means you still have a full-fat OS underneath the process, and you don’t get to take advantage of the strong isolation afforded by VMs. But through a clever use of seccomp, unikernels as processes still have strong isolation, as well as increased throughput, reduced startup time, and increased memory density. Most importantly though, with unikernels as processes we can reuse standard infrastructure and tools:

We believe that running unikernels as processes is an important step towards running them in production, because, as processes, they can reuse lighter-weight process or container tooling, be debugged with standard process debugging tools, and run in already virtualized infrastructure.

So instead of a battle between containers and unikernels, we might be able to run unikernels inside containers!

Unikernels today

Unikernels consist of an application linked against just those parts of a library OS that it needs to run directly on top of a virtual hardware abstraction. Thus they appear as VMs to the underlying infrastructure.

In the case of Linux and KVM, the ukvm monitor process handles the running of unikernels on top of KVM (like a version of QEMU, but specialised for unikernels). The ukvm monitor is used by several unikernel ecosystems, including MirageOS, IncludeOS, and Rumprun.

The userspace ukvm monitor process handles setup (e.g. allocating memory and virtual CPU, opening file descriptors) and exit. During execution, the unikernel exits to the monitor via hypercalls, usually to perform I/O.

When using virtualization technology, isolation is derived from the interface between the unikernel and the monitor process. That interface is narrow: the unikernel can exit to ukvm via at most 10 hypercalls.
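
For a concrete sense of just how narrow, the ukvm hypercall surface at around the time of the paper comprised roughly the following (names as I recall them from Solo5; consult the Solo5 source for the authoritative list):

    /* The (approximate) ukvm hypercall surface: ten exits, total. */
    enum ukvm_hypercall {
        UKVM_HYPERCALL_WALLTIME,  /* read the host wall clock */
        UKVM_HYPERCALL_PUTS,      /* write to the console */
        UKVM_HYPERCALL_POLL,      /* block until I/O is ready */
        UKVM_HYPERCALL_BLKINFO,   /* query the block device */
        UKVM_HYPERCALL_BLKWRITE,
        UKVM_HYPERCALL_BLKREAD,
        UKVM_HYPERCALL_NETINFO,   /* query the network device */
        UKVM_HYPERCALL_NETWRITE,
        UKVM_HYPERCALL_NETREAD,
        UKVM_HYPERCALL_HALT,      /* exit the unikernel */
    };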

The argument for enhanced security (isolation) comes from the narrowness of this interface. If an attacker did break through the interface into the monitor process, they would of course then be able to launch attacks across the entire Linux system call interface. For comparison, Linux has over 300 system calls, and Xen has about 20 hypercalls.

Unikernels as described above have some drawbacks though: there is no guest OS, so no familiar debugging tools; memory density can suffer, as all guests typically perform file caching independently; and every hypercall involves a context switch, doubling the cycles consumed compared to a direct function call. Finally, when an existing infrastructure-as-a-service offering is used as a base on which to build higher-level offerings (e.g. serverless platforms), unikernels can’t be deployed without nested virtualisation, which is itself difficult and often not supported.

Unikernels as processes

When using a unikernel monitor as described above, unikernels are already very similar to applications.

A running unikernel is conceptually a single process that runs the same code for its lifetime, so there is no need for it to manage page tables after setup. Unikernels use cooperative scheduling and event loops with a blocking call like poll to the underlying system for asynchronous I/O, so they do not even need to install interrupt handlers (nor use interrupts for I/O).
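
That structure is easy to picture as code. Here is a minimal sketch of such an event loop (illustrative only; handle_packet is a hypothetical handler, not Solo5 code):

    /* A unikernel-style event loop: cooperative scheduling around a
     * single blocking poll, with no interrupt handlers installed. */
    #include <poll.h>

    void handle_packet(int fd);  /* hypothetical I/O handler */

    void event_loop(int net_fd)
    {
        struct pollfd pfd = { .fd = net_fd, .events = POLLIN };
        for (;;) {
            /* Block in the host (or, later, the tender) until ready. */
            if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN))
                handle_packet(net_fd);
        }
    }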

If we think of a unikernel in this way, as a specialised application process, then maybe we can get the same narrow interface afforded by the exposed hypercalls of ukvm in some other way….

Many modern operating systems have a notion of system call whitelisting, allowing processes to transition into a mode with a more restricted system call interface available to them. For example, Linux has seccomp.

The traditional difficulty with seccomp is figuring out the set of system calls that should be allowed. For example, Docker runs containers under a default policy that allows them to perform more than 250 system calls. When we have a unikernel as our process though, we can lock the set of calls right down.
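
Here is a minimal sketch of that kind of lock-down using libseccomp (illustrative only, not nabla’s actual filter; it assumes the network descriptor net_fd was opened during setup):

    /* Default action: kill the offending thread on any disallowed syscall. */
    #include <seccomp.h>
    #include <stdlib.h>

    void lock_down(int net_fd)
    {
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
        if (ctx == NULL)
            exit(1);

        /* Console output only to stdout (first syscall argument must be 1). */
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                         SCMP_A0(SCMP_CMP_EQ, 1));

        /* Clock reads and blocking for I/O are unconditionally allowed. */
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(clock_gettime), 0);
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(ppoll), 0);

        /* Network reads only on the descriptor opened at setup time. */
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 1,
                         SCMP_A0(SCMP_CMP_EQ, (scmp_datum_t)net_fd));

        /* One-way transition: from here on, only the rules above apply. */
        if (seccomp_load(ctx) != 0)
            exit(1);
        seccomp_release(ctx);
    }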

So that’s how unikernels as processes work. In place of the ukvm monitor there’s a component the authors call a tender. The tender is responsible for setup and exit handling, just as the ukvm monitor is. However, once file descriptors etc. are set up, the tender dynamically loads the unikernel code into its own address space. Then it configures seccomp filters to allow only the system calls corresponding to the unikernel exits for I/O, and makes a one-way transition to this new mode. Finally it calls the entry point of the loaded unikernel.

… the unikernel executes as usual, but instead of invoking hypercalls when necessary, the unikernel simply does a normal procedure call to the hypercall implementation in the tender. The hypercall implementation in the tender is identical to the monitor implementation in the virtualization case; the tender will likely perform a system call to Linux, then return the result to the unikernel.
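
The upshot is that a “hypercall” compiles down to an ordinary call chain ending in a host system call. A sketch of the idea (hypothetical function name, not an actual Solo5/nabla symbol):

    #include <stdint.h>
    #include <time.h>

    /* What the unikernel links against: an ordinary function, where a
     * virtualized unikernel would instead trap out to the monitor. */
    uint64_t hypercall_walltime(void)
    {
        /* The tender's implementation: one host system call, then a
         * normal return -- no VM exit, no context switch. */
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
    }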

(The paper includes a table showing the mapping from hypercalls to system calls for ukvm-based unikernels.)

Isolation properties come from the interface between the tender and the host, since the tender becomes the unikernel when it first jumps to guest code. Seccomp filters allow the tender to specify file descriptor associations at setup time, and other checks, such as constraints on the arguments to blkwrite, can also be specified.

We therefore consider unikernels as processes to exhibit an equal degree of isolation to unikernels as VMs.

As processes, unikernels can also take advantage of ASLR, common debugging tools, memory sharing, and architecture independence for free.

Introducing nabla

Solo5 provides a unikernel base layer upon which various library OSes run, such as MirageOS, IncludeOS, and Rumprun; ukvm is distributed with Solo5. The authors extend Solo5 and ukvm by adding a new backend to ukvm enabling it to operate as a tender, and by changing the Solo5 ukvm binding to eliminate some hardware-specific setup and use a function call mechanism rather than a hypercall mechanism to access the tender. The resulting prototype system is called nabla. With that, we’re good to go…

Evaluation

Using a variety of workloads, the authors explore the isolation and performance characteristics of nabla. The evaluation compares unikernels running under ukvm, nabla, and vanilla processes.

Nabla runs the systems using 8-14 times fewer system calls than the corresponding native process.

It also accesses about half the number of kernel functions accessed by ukvm unikernels (mostly because nabla avoids virtualization-related functions).

Using a fuzzer, the authors also explored how much of the underlying kernel is reachable through the nabla interface. Compared to a seccomp policy that accepts everything, nabla reduces the number of kernel functions accessible by 98%.

Nabla also achieves higher throughput than ukvm in all cases.

Running unikernels as processes increases throughput by up to 245%, reduces startup times by up to 73%, and increases memory density by 20%. It all adds up to an exciting new set of options for unikernel projects and deployments.

The last word

In this paper, we have shown that running unikernels as processes can improve isolation over VMs, cutting their access to kernel functions in half. At the same time, running unikernels as processes moves them into the mainstream; unikernels as processes inherit many process-specific characteristics — including high memory density, performance, startup time, etc. — and tooling that were previously thought to be necessary sacrifices for isolation. Going forward, we believe this work provides insight into how to more generally architect processes and containers to be isolated in the cloud.


