Terence Parr

since Jan 13, 2010

Recent posts by Terence Parr

It was my pleasure! Code long and prosper.
Hi. It uses ANTLR, if I remember, but I've not played with it.
That's fine. You just have to keep writing delegate.method() instead of just method(), which can screw up your internal DSL and make it just a library.
Nope. I meant that you put the implementation all in one place that you can reference by name (i.e., in a class). Then when you want to use it (i.e., the methods), just inherit from the impl. For example,
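A minimal sketch of that idea, assuming made-up names (PizzaDsl and OrderScript are hypothetical, not from the thread): the DSL methods live in one named class, and the code that uses them inherits from it so the calls need no delegate prefix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical DSL implementation: all the "vocabulary" lives in one named class.
class PizzaDsl {
    protected final List<String> order = new ArrayList<>();

    protected PizzaDsl large() { order.add("large"); return this; }
    protected PizzaDsl with(String topping) { order.add(topping); return this; }
}

// Using the DSL: inherit from the implementation so calls are unqualified.
class OrderScript extends PizzaDsl {
    List<String> run() {
        // No delegate.large() or delegate.with(...) needed here.
        large().with("pepperoni").with("extra cheese");
        return order;
    }

    public static void main(String[] args) {
        System.out.println(new OrderScript().run());
    }
}
```

The definition of the DSL stays in PizzaDsl; OrderScript only contains the use of it.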

Hiya. Did a quick scan. It looks like he's advocating (correctly) that you can put your internal DSL methods into a class to encapsulate them. Then, to use the DSL, you can inherit from that class so you don't entangle your use of the DSL with its definition.

Vijitha Kumara wrote: What are the things included in explaining about creating General Programming Languages in the book?

Implementing DSLs and implementing general-purpose programming languages have a huge amount in common; hence, it makes a lot of sense to deal with them together in the implementation patterns book. The difference lies largely in degree. For example, in a simple scripting language, you might only have one scope of variables, whereas in Java you have all sorts of nested scopes. The book identifies the various symbol table patterns common to most programming languages and scripting languages. You have a single monolithic scope, nested scopes as in C, data aggregate scopes like C structs, or class hierarchies in object-oriented languages like C++. I think of configuration files such as property files as being scripts. There is a single global scope in which we put all the properties.
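To make the scope idea concrete, here is a rough sketch of a nested-scope symbol table. The class and method names (Scope, define, resolve) are assumptions for illustration, not the book's code. A single monolithic scope, as in a property file, is just the case where the enclosing scope is always null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical nested-scope symbol table: each scope points at its enclosing scope.
class Scope {
    private final Scope enclosing;                                // null for the global scope
    private final Map<String, String> symbols = new HashMap<>();  // name -> type name

    Scope(Scope enclosing) { this.enclosing = enclosing; }

    void define(String name, String type) { symbols.put(name, type); }

    // Look in this scope first, then walk outward through the enclosing scopes.
    String resolve(String name) {
        String type = symbols.get(name);
        if (type != null) return type;
        return enclosing != null ? enclosing.resolve(name) : null;
    }
}

class ScopeDemo {
    public static void main(String[] args) {
        Scope global = new Scope(null);
        global.define("x", "int");

        Scope local = new Scope(global);
        local.define("x", "float");   // shadows the global x
        local.define("y", "int");

        System.out.println(local.resolve("x"));   // found in the local scope
        System.out.println(global.resolve("x"));  // found in the global scope
        System.out.println(global.resolve("y"));  // not visible from the global scope
    }
}
```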

Burk Hufnagel wrote: So then my pizza example would be considered a valid internal DSL - right?

That would probably count as an internal DSL, as it reads pretty well as a pizza language despite being in Java. Smalltalk, for example, could do a little bit better:

That is definitely an internal DSL so I guess the Java version is too. If you modified the library so it didn't read so well, it would probably tiptoe over the blurry line into just an API; check this out:

The and() and with() methods here are the key distinguishing element between an internal DSL and just an API. Heh, maybe we should write that down somewhere. Oh, we just did! Thanks for the example, Burk!
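The contrast can be sketched in Java roughly as follows. This is an illustrative guess, not the code originally posted; PizzaOrder, addOption, and the string options are hypothetical names. The fluent methods and the plain method do the same work; only the way the calling code reads differs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder, loosely modeled on the pizza example in this thread.
class PizzaOrder {
    final String size;
    final List<String> options = new ArrayList<>();

    private PizzaOrder(String size) { this.size = size; }

    static PizzaOrder makeA(String size) { return new PizzaOrder(size); }

    PizzaOrder pizza() { return this; }   // filler word: no effect, only aids readability
    PizzaOrder and()   { return this; }   // filler word: no effect, only aids readability
    PizzaOrder with(String option) { options.add(option); return this; }

    // Plain API entry point doing the same work without the "language" feel.
    void addOption(String option) { options.add(option); }
}

class PizzaDemo {
    public static void main(String[] args) {
        // Reads like a small pizza language.
        PizzaOrder fluent = PizzaOrder.makeA("large").pizza()
                .with("thin crust").and().with("pepperoni");

        // Same result, but reads like ordinary library calls.
        PizzaOrder plain = PizzaOrder.makeA("large");
        plain.addOption("thin crust");
        plain.addOption("pepperoni");

        System.out.println(fluent.options.equals(plain.options));
    }
}
```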

dhaval yoganandi wrote: I was thinking internal languages of companies as DSL.

By internal, Martin Fowler means written within the constraints of another language. More like a library than new syntax.

Burk Hufnagel wrote: Dang. I thought I was getting the hang of it. So are you saying that when creating a DSL in Java we need to pass one or more Strings to an interpreter/renderer so that we're not using the Java syntax?

Ok, I just read Martin Fowler's internal DSL definition again. I believe we are in agreement. He says, "Internal DSLs are limited by the syntax and structure of your base language." So, they are still valid programs in, say, Java or Ruby, but you're trying to make them look like another language (a DSL). Ruby lets you do this easily; see Ruby on Rails. Java, not so much. In Java you have to say "a.add(b)", but Ruby lets you say "a + b" for some weird types like 3D vectors for a game language. In C/C++, the preprocessor lets you do some fun stuff like:


Technically, that's not C though. The C compiler only sees the result of running the macro preprocessor on that.

Internal DSLs are the quickest way to get rolling since you are really just building a library, which you do every day. When the syntax of the implementation language restricts the expressiveness of your DSL too much, it's time to build an external DSL. When you need to build a parser, you truly have an external DSL on your hands.

Look at it this way. Internal DSLs are just proper subsets of a programming language, so your creativity is constrained and, of course, nonprogrammers can't use them. Internal DSLs *are* libraries, so I think of them as that. But I'm biased towards inflexible languages like Java. In Ruby, I could make function calls that don't look like calls. This lets me create Ruby programs that don't look like what I read in the tutorials...a negative in my view, btw.

Just to confuse you, the characters within a string can be anything you want, so you can put SQL or StringTemplate or whatever you want in there. The same string can live comfortably in Python, C, Ruby, SNOBOL, Java, C#, etc. To execute that string, you need to implement a parser and an interpreter or translator.
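As a rough illustration of what "a parser and an interpreter" means for a string like "+pepperoni +triplecheese", here is a tiny sketch. The grammar (whitespace-separated items, each "+name" or "-name") and the class name ToppingInterpreter are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical topping language: whitespace-separated items, each "+name" or "-name".
class ToppingInterpreter {
    static List<String> run(String program) {
        List<String> toppings = new ArrayList<>();
        // "Parsing": split into tokens, then check each token's shape.
        for (String token : program.trim().split("\\s+")) {
            if (token.isEmpty()) continue;  // empty program yields one empty token
            if (token.length() < 2) throw new IllegalArgumentException("bad item: " + token);
            char op = token.charAt(0);
            String name = token.substring(1);
            // "Interpreting": act on each item as soon as it is recognized.
            if (op == '+') toppings.add(name);
            else if (op == '-') toppings.remove(name);
            else throw new IllegalArgumentException("bad item: " + token);
        }
        return toppings;
    }

    public static void main(String[] args) {
        System.out.println(run("+pepperoni +triplecheese -pepperoni"));
    }
}
```

The host language only ever sees a String; everything that gives the string meaning lives in run(). A real language would usually separate the parser from the interpreter, for example with a grammar-generated parser.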

Burk Hufnagel wrote: Great question Neha! Please Terence, inquiring minds want to know. Is there a mailing list or Google group for DSL designers/implementers that you're aware of? Or, perhaps some web site where they congregate to share mistakes and successes? If not maybe you could start some centered around the book's web site.

Let's take design first. I think exposing yourself to lots of languages is the key to avoiding "sins". Try to pick out what you like and dislike from some languages. Think about how languages evolve. For example, Java has gotten WAY too complicated for my taste. Some of the weird edge cases are truly frightening in complexity. It also lacks gotos, which sucks for a guy like me who has to *generate* Java a lot. Try to look at everything through the lens of a language designer. During the day tomorrow while doing your job, try to pick out the languages you see (and imagine the hidden ones): command-line shell, URL, HTTP, SMTP (mail), network protocols, programming languages, HTML, XML, config files, ...

As for implementation, you can hang out on the antlr-interest list or comp.compilers newsgroup (though I admit I don't read that anymore). Stuff pops up on various lists/forums, but too much content focuses on compilers. That's why I wrote this book and teach a non-compiler language course at Univ. of San Francisco. Almost no one is writing a compiler, but all of us are writing little languages for everything from data files to graphics DSLs, etc.

Burk Hufnagel wrote: So if I create a pizza builder class that has a "humane" interface and lets you order a pizza like this:
Pizza myPizza = PizzaBuilder.makeA(LARGE).pizza().with(THIN_CRUST).and().with(PEPPERONI).and().with(EXTRA_CHEESE);
I've actually created a DSL for placing a pizza order.

Some may disagree, but I don't think that's a DSL. It looks like Java to me with a nice API. That's a good idea, but not a DSL. Otherwise, all of my code would be in multiple DSLs. ;) Your 2nd example has a DSL in the string: "+pepperoni +triplecheese" etc...

An internal DSL to me is one where you create NEW syntax but within the existing framework of the programming language. 'Course that ain't my term. It's Fowler's. I should probably go read his def again.

nehaa arora wrote: Wow! That is indeed an interesting view to problem solving! But it would need a lot of experience to know you are not hammering down your screws?!

Yep, experience helps in two ways: knowing when to go DSL and being able to get there quickly. It also helps in designing the language (syntax/semantics)...so three ways.

I cringe when I see some of the languages out there. I remember a friend, who was designing a new programming language, telling me about a bug he had in his parser. It allowed what he thought was a cool language feature. It was a weird special case. Every tool that parsed that language would have to have a bug to match it.

Burk Hufnagel wrote: Do they include error handling (how to deal with improper syntax, or other errors) or do they assume correct input and leave implementing the error handling as an exercise for the reader?

It talks about error handling in the parsing patterns, but the ANTLR reference talks more about this, as error handling is often highly specific to a tool. It DOES, however, talk about how to compute type information and detect operations with incompatible types. That is, semantic errors, not syntax errors.
Hi Keith,

You're right that sometimes a well-designed API is better because you don't have to learn a new language. On the other hand, some DSLs are usable by nonprogrammers. I guess my rule would be that you should avoid a DSL unless you are truly much more productive writing in the DSL over an API. a+b is better than a.add(b) but not so much better that I'd bother creating a language. It all depends on how much it saves you. I've been playing around with the grammar unit testing tool, gunit, and each "this should give you that" testing pair saves me a whole bunch of cut-and-paste, but more importantly it's much more readable than the equivalent raw junit (I'm using gunit to translate my grammar fragment tests to junit tests). gunit is at least a 5 to 1 compression over raw code.

Don't forget that ANTLR is itself a DSL for building DSLs. We get huge compression going from handbuilt recursive descent parsers down to a grammar. Yes, people have built parsing libraries, but they are never as easy to read as a grammar. As another example, consider the simple graphics DOT language used by graphviz. We could write a series of method calls that create nodes and connect them up, but it seems much easier simply to say "a -> b". It only took me a second to glance at their manual to figure out the language. I'd wager it would've taken a lot longer to figure out a library. There's nothing for free, so I guess we either have to figure out how the API works or learn a language.

My general progression is to start by building a library. If I end up doing a lot of cut-and-paste and figure that I will need this functionality in the future, I will design a simple DSL that is much more expressive. 'course I've gotten pretty fast at building little language applications.

I'm editing this to add an example from the book in section "Creating Target-Specific Generator Classes". Let's say we want to generate bytecodes using an API. Here is the sample code from the BCEL manual to create the method definition object for a hello world main():

That seems like a lot of work just to define a method. All of these constructor calls really add up quickly to a very large code generator.

Rather than using these target-specific generator classes, it's easier to use a few print statements to generate a method definition in a bytecode DSL. For example, using the Jasmin bytecode assembler, here’s what main() looks like:

This is much clearer. Of course, we have to learn a tiny bit of syntax, but that's easier than learning the vagaries of a library such as BCEL. So DSLs count for output, not just input.
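To show the "few print statements" approach, here is a hedged sketch of a generator that emits a Jasmin-style hello world main(). It is not the listing from the book; the class name JasminEmitter is hypothetical, and the Jasmin text is my best recollection of its syntax, so check it against the Jasmin documentation before relying on it.

```java
// Sketch: emit a Jasmin-style method definition with plain string output
// instead of building it from target-specific generator objects.
class JasminEmitter {
    static String helloMain() {
        StringBuilder out = new StringBuilder();
        out.append(".method public static main([Ljava/lang/String;)V\n");
        out.append("    .limit stack 2\n");
        out.append("    getstatic java/lang/System/out Ljava/io/PrintStream;\n");
        out.append("    ldc \"Hello, world\"\n");
        out.append("    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V\n");
        out.append("    return\n");
        out.append(".end method\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // The generated text would then be fed to the assembler.
        System.out.print(helloMain());
    }
}
```

The generator itself is only a handful of string appends; the structure of the output is visible right in the source.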