Tuesday, August 23, 2016

On Generator Functions, Yield and Return

Here's the question, lightly edited to remove the garbage. (Sometimes I'm charitable and call it "rambling". Today, I'm not feeling charitable about the garbage writing style filled with strange assumptions instead of questions.)

someone asked if you could have both a yield and a return in the same ... function/iterator. There was debate and the senior people said, let's actually write code. They wrote code and proved that couldn't have both a yield and a return in the same ... function/iterator. .... 
The meeting moved on w/out anyone asking the why question. Why doesn't it make sense to have both a yield and a return. ...

The impact of the yield statement can be confusing. Writing code to mess around with it was somehow unhelpful. And the shocking "proved that couldn't have both a yield and a return in the same ... function" is a serious problem.

(Or a seriously incorrect summary of the conversation; a very real possibility considering the garbage-encrusted email. Or a sign that Python 3 isn't widely-enough used and the email omitted this essential fact. And yes, I'm being overly sensitive to the garbage. But there's a better way to come to grips with reality and it involves asking questions and parsing details instead of repeating assumptions and writing garbage.)

An example


>>> def silly(n, stop=None):
...     for i in range(n):
...         if i == stop: return
...         yield i
...
>>> list(silly(5))
[0, 1, 2, 3, 4]
>>> list(silly(5, stop=3))
[0, 1, 2]

This works in both Python 3.5.1 and 2.7.10.

Some discussion

A definition with no yield is a conventional function: the parameters from some domain are mapped to a return value in some range. Each mapping is a single evaluation of the function with concrete argument values.

A definition with a yield statement becomes an iterable generator of (potentially) multiple values. The return statement changes its behavior slightly. It no longer defines the one (and only) return value. In a generator function (one that has a yield) the return statement can be thought of as if it raised the StopIteration exception as a way to exit from the generator.

As can be seen in the example above, both statements are in one function. They both work to provide expected semantics.
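A minimal sketch of that behavior, stepping through the generator by hand with next() instead of list(). It repeats silly() from the example above; the names gen, first, and exhausted are only for illustration.

```python
def silly(n, stop=None):
    for i in range(n):
        if i == stop:
            return  # bare return: ends the generator, no value involved
        yield i

gen = silly(5, stop=1)
first = next(gen)        # runs the body up to the first yield

try:
    next(gen)            # i == stop now, so the bare return ends iteration
    exhausted = False
except StopIteration:
    exhausted = True

print(first, exhausted)
```

The for statement and list() catch that StopIteration quietly, which is why the earlier examples simply end.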

The code which gets an error (under Python 2, where it's a SyntaxError) is this:

>>> def silly(n, stop=3):
...     for i in range(n):
...         if i == stop: return "boom!"
...         yield i


The "why?" question should -- perhaps -- be obvious at this point.  The return raises an exception; it doesn't provide a value.
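For what it's worth, Python 3.3 and later do accept a return with a value inside a generator. The value rides along on the StopIteration exception; it never shows up among the yielded items. A small sketch, assuming Python 3 (under Python 2 this definition is rejected as a SyntaxError); silly_with_value and drain are illustrative names.

```python
def silly_with_value(n, stop=3):
    for i in range(n):
        if i == stop:
            return "boom!"   # becomes StopIteration.value, not a yielded item
        yield i

def drain(gen):
    """Collect the yielded items plus the value carried by StopIteration."""
    items = []
    while True:
        try:
            items.append(next(gen))
        except StopIteration as exc:
            return items, exc.value

items, final = drain(silly_with_value(5))
print(items, final)
print(list(silly_with_value(5)))  # list() silently discards the "boom!"
```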

The topic, however, remains troubling. The phrase "have both a yield and a return" is bothersome because it fails to recognize that the yield statement has a special role. The yield statement transforms the semantics of the function to make it into a different object with similar syntax.
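One way to see the transformation, sketched with the standard inspect module. The names plain and lazy are made up for the illustration.

```python
import inspect

def plain(n):
    return list(range(n))     # runs to completion, hands back one value

def lazy(n):
    for i in range(n):
        yield i               # the presence of yield changes what the def creates

result = lazy(3)              # no body code has run yet; this is a generator object

print(inspect.isgeneratorfunction(plain))
print(inspect.isgeneratorfunction(lazy))
print(inspect.isgenerator(result))
print(plain(3), list(result))
```

Same def syntax, two different kinds of object.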

It's not a matter of having them "both". It's a matter of having a return in a generator. This is an entirely separate and trivial-to-answer question.

A Long Useless Rant

The email seems to contain an implicit assumption. It's the notion that programming language semantics are subtle and slippery things. And even "senior people" can't get it right. Because all programming languages (other than the email sender's personal favorite) are inherently confusing. The confusion cannot be avoided.

There are times when programming language semantics are confusing.  For example, the ++ operator in C is confusing. Nothing can be done about that. The original definition was tied to the PDP-11 machine instructions. Since then... Well.... Aspects of the generated code are formally undefined.  Many languages have one or more places where the semantics are "undefined" or only defined by example.

This is not one of those times.

Here's the real problem I have with the garbage aspect of the email.

If you bring personal baggage to the conversation -- i.e., assumptions based on a comparison between some other language and Python -- confusion will erupt all over the place. Languages are different. Concepts don't map from language to language very well. Yes, there are simple abstract principles which have different concrete realizations in different languages. But among the various concrete realizations, there may not be a simple mapping.

It's essential to discard all knowledge of all previous favorite programming languages when learning a new language.

I'll repeat that for the author of the email.

Don't Go To The Well With A Full Bucket.

You won't get anything.

In this specific case, the notion of "function" in Python is expanded to include two superficially similar things. The syntax is nearly identical. But the behaviors are remarkably different. It's essential to grasp the idea that the two things are different, and can't be casually lumped together as "function/iterator".

The crux of the email appears to be a failure to get the Python language rules in a profound way. 

Tuesday, August 16, 2016

Twelve Important Design Patterns

Read this: http://12factor.net/

Then. After reading it. Read it again to be sure you've got it. It's dense with best practices.

Now that you've read it, make yourself a Quality Engineering checklist.

I. Codebase: One codebase tracked in revision control, many deploys
II. Dependencies: Explicitly declare and isolate dependencies
III. Config: Store config in the environment
IV. Backing services: Treat backing services as attached resources
V. Build, release, run: Strictly separate build and run stages
VI. Processes: Execute the app as one or more stateless processes
VII. Port binding: Export services via port binding
VIII. Concurrency: Scale out via the process model
IX. Disposability: Maximize robustness with fast startup and graceful shutdown
X. Dev/prod parity: Keep development, staging, and production as similar as possible
XI. Logs: Treat logs as event streams
XII. Admin processes: Run admin/management tasks as one-off processes

If your app doesn't follow all of these patterns, you've got technical debt to work off. Start by posting the debt remediation stories in Jira (or whatever you're using.)

I've got config issues left, right, and center. Numerous hard-coded assumptions include the URLs for RESTful services on which my RESTful services rely: this is not good.
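A sketch of what the Config factor might look like for those URLs, assuming each upstream service's address arrives in an environment variable. The naming scheme (NAME_SERVICE_URL), the helper service_url(), and the example host are all hypothetical, not taken from any real app.

```python
import os

def service_url(name, environ=None):
    """Look up the base URL for an upstream service from the environment."""
    environ = os.environ if environ is None else environ
    key = "{0}_SERVICE_URL".format(name.upper())
    try:
        return environ[key].rstrip("/")
    except KeyError:
        # Fail loudly at startup instead of silently using a baked-in URL.
        raise RuntimeError("missing required setting: " + key)

# A plain dict stands in for os.environ so the sketch runs anywhere.
fake_env = {"INVENTORY_SERVICE_URL": "http://localhost:8080/"}
print(service_url("inventory", fake_env))
```

Failing at startup when a setting is missing is a design choice here; a default value would also work, but it tends to hide exactly the kind of assumption being removed.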

Some of these things, however, are a done deed in the Python/Flask world with no real thinking required.

  • Build, release, run - done
  • Processes - done
  • Port binding - done
  • Disposability - done

Other things require some care. And the config is something that I've really got to fix.

Tuesday, August 9, 2016

That Feeling When... You're reading your own documentation because it's useful and (mostly) correct

I'm looking at code (as a man does) and I can't remember if there's a class that does X. There's a lot of code. I wrote almost all of it. And -- maybe it's the gin -- but I just can't recall if there's an X. It seems like there should be.

Scan. Scan. Scroll. Scroll.

Read. Read.

Wait!

I have a pretty good gh-pages branch for this. Sphinx-based. Mostly up-to-date. Let's look there.

Ahhh. So much nicer than scrolling through code. Indexes work.

This whole "documentation" thing is pretty cool. Now I'm actually happy that other people guilted me into doing it.

Tuesday, August 2, 2016

Lamenting the Death of Object-Oriented Programming. (Sigh) Again?

See Goodbye, Object Oriented Programming.

I don't want to say that the entire article is bunk. It's not. It raises a few good points. Points which I thought were pretty well known.

What's aggravating is that this lamentation is overly broad.  It treats all languages as if they're Java or C++. That's not true, and as a consequence, the article is less useful than it could be.

Banana Monkey Jungle Problem. Only true if you are sadly mistaken about the unit of reuse. The class as unit of reuse -- across projects -- is false, has been false, and will always be false. The idea of class inheritance for reuse makes perfect sense. Sharing individual classes between projects has never (as far as I know) been a promise of OO programming. Maybe I read the wrong books and missed that promise.

The Diamond Problem. Isn't actually a problem. Python has a defined method resolution order.
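A quick sketch of that resolution order for the classic four-class shape, assuming Python 3. The class names are invented for the example.

```python
class Base:
    def who(self):
        return "Base"

class Left(Base):
    def who(self):
        return "Left"

class Right(Base):
    def who(self):
        return "Right"

class Bottom(Left, Right):
    pass   # inherits who() from both parents; the MRO picks one

order = [cls.__name__ for cls in Bottom.__mro__]
print(order)
print(Bottom().who())
```

The order is fixed by the class statement, so there's nothing ambiguous about which who() gets called.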

The Fragile Base Class Problem. This points out the well known issue with having concrete classes depend on other concrete classes. The SOLID design principles suggest concrete classes should depend on abstractions. Abstractions do not suffer (as much) from the fragile base class problem.
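A small sketch of depending on an abstraction, using the standard abc module. Store, MemoryStore, and save_greeting are hypothetical names chosen for the illustration.

```python
from abc import ABC, abstractmethod

class Store(ABC):
    """The abstraction: says what is needed, has no implementation to break."""

    @abstractmethod
    def put(self, key, value):
        ...

    @abstractmethod
    def get(self, key):
        ...

class MemoryStore(Store):
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

def save_greeting(store, name):
    # Depends only on the Store interface, not on any concrete class.
    store.put(name, "hello, " + name)
    return store.get(name)

print(save_greeting(MemoryStore(), "world"))
```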

The Hierarchy Problem. I guess the idea that the real world is multi-dimensional can be confusing. If everything has to be force-fit into single inheritance, this would create the hierarchy problem. If we allow multiple inheritance, this problem evaporates.

The Reference Problem. Even C++ has "smart" pointer packages. Java has garbage collection. Python does reference counting. This is only a problem if you go out of your way to deal with pointers in a primitive way.

The part on Polymorphism didn't make any sense. There didn't seem to be a tidy problem. Just a confusingly vague statement that "Interfaces will give you [polymorphism?]. And without all of the baggage of OO". I don't get how interfaces are necessary without the baggage of OO. So, I can't really try to refute this.

In the long run, I guess this was a way to introduce some of the benefits of a functional approach. I'm not sure that this kind of criticism of object-oriented programming is very helpful. It doesn't apply to all OO languages, so it's misleading at best. (At worst, it's simply wrong.)

I think these problems are interesting and can be used to show the benefits of functional programming. But without the actual functional programming examples, this isn't very useful.

Tuesday, July 12, 2016

Getting Rid of the Gang-of-Four Design Patterns is Nonsense

Someone found Yet Another Post (YAP™) insisting that the Gang of Four (GOF™) patterns were on their last legs. The email was misleading, because this is not precisely what the article said. The bottom line was that Design Patterns in general are merely a response to gaps in the underlying programming language. A position that's nonsense at its very foundation.

The lexicon of design patterns varies from language to language. GoF patterns aren't "going away." They're part of the Java/C++ world. They don't apply quite the same way to Python or functional languages.

There's a more serious issue, though: Language Mapping. First some background.

Design Patterns

Design Patterns will always exist. They're an artifact of how we process the world. We tend to classify individual objects so that we don't have to deal with each object as a separate wonder of nature.

It's Just Another Brick In The Wall.

We don't have to examine each rectangular solid of ceramic and understand the wonderfulness of it. We can group and summarize. Classify. Brick is a design pattern. So is masonry. So is wall. They're all patterns. It's how we think.

Design Patterns and Language Gaps

There's a claim that moving toward functional languages will kill design patterns. This presumes (partly) that non-OO languages magically don't have design patterns. This is (see above) kind of insane. Languages have design patterns. We recognize these patterns all the time.

A functional language has a common technique (or pattern) for visiting nodes in a hierarchy. We don't dwell on the wonderfulness of the code as if we'd never seen it before. Instead, we classify it based on the design pattern, and leverage this higher-level understanding to figure out why we're walking a hierarchy.
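In functional-style Python, for instance, the pattern might be sketched as a recursive generator. The walk() function and the sample tree are illustrative only, and the yield from syntax assumes Python 3.3 or later.

```python
def walk(node):
    """Yield the leaves of a nested list/tuple structure, depth first."""
    if isinstance(node, (list, tuple)):
        for child in node:
            yield from walk(child)
    else:
        yield node

tree = [1, [2, [3, 4]], (5, [6])]
print(list(walk(tree)))
print(sum(walk(tree)))
```

Once the shape is recognized as "walk the hierarchy", the interesting question is what gets done with the leaves.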

Sounding the death knell for design patterns also presumes (partly) that functional languages are magically more complete than OO languages. In this newer better language, we don't need patterns because there are no gaps. This is pretty much nutso, too. The Patterns Fill Language Gaps school of thought ignores the fact that there are many ways to implement these "gaps". We can use GoF design patterns, or we can use other software designs that don't fit the GoF design patterns. Both work.

The patterns aren't filling a "gap." They're providing guidance on how to implement something. That's all. Nothing more. Guidance.

"But wait," you say, "since I needed to write code, that's evidence that there's a gap."

"What?" I ask, incredulous. "Are you claiming that any code is evidence of a language gap? Does that mean all application software is just a language gap?"

"Let's not be silly," you say. "I can split a hair and create a tiny distinction between software I shouldn't have to write and software I should have to write."

I remain incredulous.

Design Patterns as Damage

The idea that somehow the GoF design patterns are a problem is also goofy. The GoF design patterns are pretty slick. They solve a fairly broad suite of problems in an elegant and consistent manner.

They're just good design.

Yes, they can be complex. Sorry about that. Software can be complex if you want really excellent flexibility and extensibility.

AND.

Bonus.

Software can be complex when you have to work around the problems of "compiler" and "locked libraries" and "no source." That is, the GoF patterns apply in full force for C++ and Java where you're trying to protect your intellectual property by disclosing only headers and obfuscated implementation details. Indeed, there are few alternatives to the GoF patterns if you're going to distribute a framework that has no visible source and needs to leave extension points for users.

If you don't have Locked-NoSource-Compiled code as a backdrop, the GoF patterns can be simplified a little. But some of the patterns are essential. And remain essential. There are some really great ideas there.

In Python world, we rely on a modified subset of the GoF patterns. They work extremely well.

When writing functional-style Python using immutable data structures (to the extent possible), we use a different set of design patterns. Not so many GoF patterns when we're trying to avoid stateful objects. But some patterns (like the Abstract Factory) are really very helpful even in a largely functional context. It morphs from an abstract factory class to a factory function, and it loses the "abstract" concept that's part of C++ and Java, but the core Factory design pattern remains.
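A minimal sketch of a factory function in that style, using only the standard library. The names make_parser and parse_csv, and the two formats chosen, are assumptions made for the example.

```python
import csv
import io
import json

def parse_csv(text):
    # Tuples rather than lists, to keep the result immutable.
    return tuple(tuple(row) for row in csv.reader(io.StringIO(text)))

def make_parser(kind):
    """Factory function: return a parser callable for the named format."""
    parsers = {"json": json.loads, "csv": parse_csv}
    try:
        return parsers[kind]
    except KeyError:
        raise ValueError("no parser for {0!r}".format(kind))

parse = make_parser("csv")
print(parse("a,b\n1,2\n"))
print(make_parser("json")('{"a": 1}'))
```

There's no abstract class anywhere; the caller still asks a factory for "something that parses" without naming the concrete implementation.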

The Serious Issue

The serious issue that is surfaced by the email is Language Mapping. We cannot (and must not) try to map languages to each other. What is true for Java design is emphatically not true for Python design. And it doesn't apply to assembly languages, FORTRAN, FORTH, or COBOL.

Languages are different.

There. I said it.

If there was an underlying "universal deep structure" behind all programming languages, the surface features would be merely syntax, and we'd have automated translation among languages. The universal deep structure (the underlying Turing Machine that does computations) appears to be too abstract to map well among programming languages. Hence the lack of translators.

When switching among languages, it's important to leave all baggage behind.

When moving from Java < 8 to Java >= 8 (i.e., non-functional Java to more functional Java) we can't trivially map all design patterns among the language features. It's a new language with new features that happens to be compatible with the old language.

Attempting to trivially map concepts between non-functional (or strictly-OO Java) and more functional Java leads to dumb conclusions. Like the GoF patterns are dying. Or the GoF patterns represent damage or something else equally goofy.

The language changes lead to design pattern changes.

Language change doesn't deserve a gleeful/anguished blog post celebrating/lamenting the differences. It's a consequence of learning a new language, or new features of an existing language.

Please avoid mapping languages to each other.

Tuesday, June 21, 2016

Why Python? (Sad Follow-up)

In "Why Python?" I linked to a deep and sophisticated analysis of programming languages. Anyway, I thought it was a deep and sophisticated analysis.

I got a reply that shows how wrong I was. Here's the quote:
The point is that the Python ecosystem has a lot to offer. We could argue about the language design choices. However, why bother? Why not just take advantage of what the ecosystem has to offer.
Ah. Discussing the language is just "arguing". I guess the points are all debatable and my comparison of Python to any benchmark is just the seed for an argument. A religious war, perhaps. I guess this wasn't compelling. It was a "why bother?"

Why bother pointing out the strong points of the language?

The email emphasized the "ecosystem" with a cool, but short example of how scipy.spatial.KDTree works.

It appears that -- for some people -- "Python code actually works" is a useful response to "why python?" 

I would have thought that "Python code actually works" was a precondition to even discussing the value proposition behind Python.

But -- clearly -- I was wrong.  The mere fact of a working example is a Very Important Thing™.

What does this mean?
  1. There are people who use software that doesn't actually work. When they see software that works, it's important. Very important.
  2. When software actually works, these people find this simple fact to be a compelling and substantial argument for placing a high value on the software.
  3. Other considerations like clarity and simplicity aren't relevant. If these poor souls are suffering software that doesn't actually work, then broken and obscure is still broken. Other parts of the long discussion from Wirth are just arguing points.
The email included "consider amending the why python? blog w/ the other big pro: ecosystem". I'm not sure I actually understand the request. When code that works is a "big pro", this comes from a world I can't pretend to understand.

Also. The example code used xrange(). Which is a Python 2 smell. Those days are past.
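For anyone porting such an example: in Python 3, range() is lazy in the way xrange() was, so it's the direct replacement. A tiny sketch, assuming Python 3; the name big is just for illustration.

```python
big = range(10**9)     # no billion-element list is built

print(len(big))
print(big[5])
print(10**8 in big)    # membership is computed arithmetically, not by scanning
```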