IronPython Bytecode Interpreter

One of the things that people would really like to have on IronPython is support for pre-compiled Python (.pyc) files. These files are pre-parsed and converted into the bytecode format that the Python interpreter actually runs. This speeds execution up for a couple of reasons:

  1. The application does not need to parse the source code prior to execution
  2. The operations are usually at least minimally optimized already by the parsing engine
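To make the two points above concrete, here is a minimal sketch using CPython's standard library (not IronPython, and not the code from my commits). It compiles a snippet once, round-trips the resulting code object through marshal, which is roughly the payload a .pyc file stores after its header, and then executes it without touching the source text again. The variable names are just for illustration.

```python
import dis
import marshal

source = "total = 1 + 2\n"

# Parse and compile once; the result is a code object holding bytecode.
code = compile(source, "<example>", "exec")

# Serialise and restore the code object, roughly what a .pyc round trip does
# (a real .pyc file also carries a small header before this payload).
payload = marshal.dumps(code)
restored = marshal.loads(payload)

namespace = {}
exec(restored, namespace)  # no parsing of source text happens here
print(namespace["total"])

# Show the instructions; the exact listing varies between Python versions.
dis.dis(restored)
```

On CPython the constant expression is typically folded at compile time, which is the kind of minimal optimization mentioned in the list.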

I was intrigued by the Python bytecode; it looked like a fun little thing to play around with, so I developed very minimal, early support for bytecode interpretation in the IronPython codebase. You can see my commits at my fork of the IronPython repository on GitHub. There is still a LOT of work that needs to be done, but there are a lot of smarter people out there who can take it and run with it once the basic framework is in place.

The commits above also add support for the ever-important __hello__ and __phello__ imports, one more thing that makes IronPython more compatible with CPython.

The Greatest Classical Concert of All Time

This is not an historical account of any concert I have ever been to. It is more like a wishlist for the perfect concert of classical music that I can think of. These pieces are in no particular order.

Respighi – Pines of the Appian Way

This conductor is a bit overboard, but fun to watch.

Dvorak – New World Symphony

I can listen to this full piece over and over and over. The Largo (movement 2) is one of my favorite pieces of music ever written.

Tchaikovsky – 1812 Overture

I played this piece in high school symphony, and it has left a lasting impression on me. There are several parts where the trombones rest for what seems like hundreds of measures, so the other trombone players and I came up with a nice story about a little Russian woman whose village is razed while she is away, so she goes on a quest for vengeance. Along the way she meets up with an Indian snake charmer and some other interesting people. The cannons at the end are her triumphal attack on the city of the people who killed her family. Don’t judge me, it was a long time to rest.

Tchaikovsky – Marche Slave

This is a piece that was done by the same symphony above the year before I was in high school. My sister was in choral groups and all the musical groups did a big concert, so I got to hear this piece and have loved it since.

Edvard Grieg – In the Hall of the Mountain King

One of the first classical pieces I remember being exposed to. I am not sure where or when, but I’ve always liked it.

Aaron Copland – Fanfare for the Common Man

This piece has been used in movies a lot, but I don’t think it’s been overused. It’s still a moving piece to me. I love the combination of the percussion and brass. Simple.

There are other pieces I could add to this perfect concert, but I’m not going to.

Prototyping and Testing Groovy Email Templates

One of the features people have requested for the Jenkins email-ext plugin is an easier way to test out their templates for Groovy-generated email content (see JENKINS-9594). The email-ext plugin supports Groovy’s SimpleTemplateEngine for generating the email body (and other areas). I haven’t had the chance to implement this feature yet, but I found a fairly easy way to test out templates against a previous build. It can be run from the Jenkins Script Console. The Groovy code below gets a project that you specify by name, creates a copy of it, and then performs the build step for the ExtendedEmailPublisher with the previous build you want to test with. It even prints out the build log output from the ExtendedEmailPublisher run. You can change anything about the ExtendedEmailPublisher that you might need to before calling perform. This is not the final solution; I still plan on implementing this feature when I get the time to look into it more.

import hudson.model.AbstractBuild
import hudson.model.StreamBuildListener
import hudson.plugins.emailext.ExtendedEmailPublisher
import java.io.ByteArrayOutputStream
import jenkins.model.Jenkins

def projectName = "SomeProject"
def project = Jenkins.instance.getItem(projectName)

// declared outside the try block so the finally block can see it
def testing = null
try {
  // work on a copy so the real job is never modified
  Jenkins.instance.copy(project, "$projectName-Testing")
  testing = Jenkins.instance.getItem("$projectName-Testing")

  def build = project.lastBuild
  // or def build = project.lastFailedBuild
  // see the javadoc for the Job class for other ways to get builds:
  // http://javadoc.jenkins-ci.org/hudson/model/Job.html#getLastBuild()

  def baos = new ByteArrayOutputStream()
  def listener = new StreamBuildListener(baos)

  testing.publishersList.each { p ->
    println(p)
    if (p instanceof ExtendedEmailPublisher) {
      // modify the properties as necessary here
      p.recipientList = 'me@me.com' // set the recipient list while testing

      // run the publisher against the chosen build
      p.perform((AbstractBuild<?, ?>) build, null, listener)
      // print out the build log from ExtendedEmailPublisher
      println(new String(baos.toByteArray(), "UTF-8"))
    }
  }
} finally {
  if (testing != null) {
    // clean up the test job
    testing.delete()
  }
}

Update: Thanks to Josh Unger for a few updates to the above to make it more robust.

Groovy ‘def’ Jam

As a way to blow off some steam, I like to contribute to open source software. You might ask why a software developer who spends his entire day writing code would want to spend his free time writing more software. I honestly don’t know the answer to that question. I find something enjoyable in giving something back, or something along those lines.

Anywho, one of the projects that I contribute to and have talked about on here before is the Jenkins continuous integration server. Jenkins uses an MVC model for displaying webpages and interacting with the API of the application, and the views are created using Jelly. I will be completely honest, I hate creating and using Jelly views. Whoever thought up “executable XML”…

I found out that you could also do views using Groovy, so basically you just use scripting when you want and use the tag libraries like you would from Jelly, but get this, it doesn’t suck!

I wrote a little utility to convert Jelly views to Groovy views, because when I found out I could convert the views in the email-ext plugin to Groovy from Jelly, I wanted to do it immediately, but who wants to convert by hand! We’re software developers, we don’t do things by hand. We’ll spend twice as long to write a tool as it would take to do it by hand, but then, by George, if we have to do it again, it will take milliseconds!

So, the tool is strangely called jelly2groovy because you have to use that naming format when writing a tool like this, just in case you didn’t know that. I was converting the views in the email-ext plugin from Jelly to Groovy and the following code was generated.

f.entry(title:_("Default Subject"), help: "/plugin/email-ext/help/projectConfig/defaultSubject.html") {
  if(instance.configured) {
    input(name: "project_default_subject", value: instance.defaultSubject, class: "setting-input", type: "text")
  } 
  else {
    input(name: "project_default_subject", value: "$DEFAULT_SUBJECT", class: "setting-input", type: "text")
  }
}
}

The thing to notice in that code is the double curly brace at the end. I only have one place that outputs a closing curly brace in the conversion script.

if( doOutput && (elem.children().size() > 0) ) {
  out.writeLine("${'  ' * indent}}")
}

I set doOutput to either true or false depending on if the tag I am rendering needs it or not (the Jelly choose tag doesn’t need to be rendered, just the when/otherwise children).

So, somehow, doOutput was getting set to true, even though I set it to false inside the check for the ‘choose’ tag element.

Wha?!

The code is basically like this:

doOutput = true
...
if(tag == 'choose') {
  doOutput = false
} 
...
// iterate over children by calling the current method recursively
if(doOutput...) {
   // generate the closing curly brace
}

Not very complex. It turns out, though, that there is a subtle issue with the way I wrote the code, and it all lies in a three-letter keyword: ‘def’.

You can read a full description of the meaning of ‘def’ if you would like to do so, but it boils down to the following: NOT putting ‘def’ in front of a variable definition in a Groovy script makes the variable behave almost as if it were global, while putting ‘def’ in front of the declaration restricts its scope to be local. Without the ‘def’ in front of doOutput, every recursive call shared the same variable. When a recursive call set doOutput to true, the variable kept that value after the call returned, so the ending curly brace was rendered.

Once I figured that out, I added ‘def’ in front of some key variables, and things worked perfectly.
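Python has no ‘def’ keyword in this sense, but the same failure is easy to reproduce with a module-level variable, so here is a rough analogue rather than the actual jelly2groovy code. The node helper, the tag names and both render functions are made up for the illustration; the only point is that a flag shared across recursive calls gets overwritten by the children.

```python
do_output = True  # module-level: stands in for the Groovy variable without 'def'


def node(tag, *children):
    """Build a tiny stand-in for a parsed Jelly element."""
    return {"tag": tag, "children": list(children)}


def render_shared(n, out):
    """Buggy version: every recursive call overwrites the same flag."""
    global do_output
    do_output = n["tag"] != "choose"
    has_children = bool(n["children"])
    if do_output:
        out.append(n["tag"] + (" {" if has_children else ""))
    for child in n["children"]:
        render_shared(child, out)
    if do_output and has_children:  # a child may have flipped the flag
        out.append("}")


def render_local(n, out):
    """Fixed version: the flag is local to each call."""
    do_output = n["tag"] != "choose"
    has_children = bool(n["children"])
    if do_output:
        out.append(n["tag"] + (" {" if has_children else ""))
    for child in n["children"]:
        render_local(child, out)
    if do_output and has_children:
        out.append("}")


tree = node("entry",
            node("choose",
                 node("when", node("input")),
                 node("otherwise", node("input"))))

shared_out, local_out = [], []
render_shared(tree, shared_out)
render_local(tree, local_out)

# Three blocks are opened in both cases, but the shared flag closes four.
print(shared_out.count("}"), local_out.count("}"))  # prints: 4 3
```

The extra closing brace in the shared version comes from the ‘choose’ node: it turned the flag off for itself, its children turned it back on, and by the time the recursion returned the flag said to emit a brace.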

jelly2groovy is now working on several tags and does a good job of converting things over. Obviously there are still tags I don’t handle and things I haven’t tried yet (taglibs!), but it’s coming along nicely.

Jenkins – Standalone Build Generator

I’ve blogged before about how we use Jenkins at work for our continuous integration solution. One thing that our previous CI solution had was the ability for developers to run a standalone version of the tool on their development PCs to check out large-scale changes that might break several applications. Jenkins is much more difficult to do this with, mainly because we are using Rational ClearCase for SCM. We could use the Jenkins server to build from individual development streams if we wanted to, but it would require that all the files be checked in before being able to build locally.

I came up with a Groovy script that runs after each Nightly Build, collects the jobs that are currently in Jenkins and generates a standalone zip file. Developers can download it and launch a local instance of Jenkins to do a build from their view. The script is used as a Groovy build step.

Jenkins – Jelly to Groovy

At work we use Jenkins for our continuous integration setup. As I have mentioned previously I really, really like Jenkins (I blogged about Hudson previously, but we moved with the fork to Jenkins since it is more community driven).

I took over as the maintainer for the email-ext plugin, which allows you to configure the emails sent for failures and other build results to a much higher level than the default Mailer. You can have different triggers for different statuses, you can include various pieces of information in your email templates. You can even use scripts to generate the emails. See the wiki page above if you are interested in more information.

Jenkins uses MVC for displaying web pages and interacting with the system. You create a view template in either Jelly, which is an “executable” XML format, or Groovy, which is a scripting language for the Java Virtual Machine (JVM). Jelly views are VERY painful to debug when something goes wrong. Errors are not easy to track down, and it is VERY painful to do some things (like call methods on objects and define variables and conditionals and…well, pretty much everything). Groovy, on the other hand, is very nice to work with. You basically have a full scripting language to use to your advantage. Conditionals, object creation, and variables are all just as easy as if you were writing a simple script (which you are!).

I wanted to start migrating the email-ext plugin to use Groovy views, because I think it gives a lot of power when trying to do things with the Jenkins API. I hand ported one view and it didn’t really take that long, but as most software people realize at some point when dealing with XML, the computer could be doing this for me! I spent about 20 minutes or so writing this initial version of a Jelly to Groovy converter. For very simple views, it works great. Feel free to fork it on GitHub and send me a pull request with updates. Hopefully it will be useful to someone else.

https://github.com/slide/jelly2groovy

Migrate CodePlex Issues to GitHub Issues

The IronPython project is looking at moving completely to GitHub for all project information: downloads, issues, wiki, etc. The main problem is that IronPython currently resides on CodePlex, and CodePlex, sadly, does not provide an API for accessing anything. This means we need to use screen scraping to get the job done on the CodePlex side. On the GitHub side, they have a wonderful API that is very well documented and has libraries for many languages. BeautifulSoup is a library I have previously used for screen scraping from Python and it was a great experience; it’s a simple-to-use library.

Some goals for the script based on feedback from the project:

  1. Maintain history (comments) as much as possible
  2. Maintain component notations
  3. Maintain releases
  4. Migrate both open and closed issues
  5. Migrate attachments if possible

When doing screen scraping, I really like to use the developer tools from whatever browser I am using (usually Chrome) in order to make viewing the source and finding patterns in the HTML easier. I decided to scrape the information I needed in a couple different steps. I could get some of the information from the list of issues, but then I would also need to go to each individual issue page and scrape information from there.

I decided to use the Advanced view for the bug tracker on CodePlex because it had a lot of information that I could pull out right from the get go.

CodePlex Issues Advanced View

IronPython advanced view for issues.

You can see that we can get information like ID, Title, Status, Type, Priority and last update  (though the last update wasn’t really useful). It was also possible to grab the link for the specific issue for use later.

One thing I did when writing the script was set up the filters and sorting the way I wanted prior to grabbing the soup. I then used the direct link that can be found on the page to get the issues in the order I really wanted.

As you can see from the screenshot below, if you use the “Inspect Element” in Chrome it will show you the structure for each row in the list of issues.

CodePlex Issue Row

Row information from advanced view.

Each row of the advanced view has several pieces that we can pull out, and each row starts with “row_checkbox_” this makes it very easy to loop through each row using BeautifulSoup.
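As a hedged illustration of that loop, the sketch below runs BeautifulSoup over a hand-written fragment. The markup is invented for the example (the real CodePlex page is more involved), and it assumes the “row_checkbox_” prefix is on the id of each tr element. It also uses the built-in “html.parser” backend so that nothing beyond bs4 is needed to run it.

```python
import re

from bs4 import BeautifulSoup

# Invented stand-in for the advanced view; only the id prefix comes from the text.
html = """
<table>
  <tr id="header"><td>ID</td><td>Title</td></tr>
  <tr id="row_checkbox_101"><td>101</td>
      <td><a href="/workitem/101">Crash on import</a></td></tr>
  <tr id="row_checkbox_102"><td>102</td>
      <td><a href="/workitem/102">Wrong repr output</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

issues = []
# A compiled regex as the id filter matches every row whose id has the prefix.
for row in soup.find_all("tr", id=re.compile(r"^row_checkbox_")):
    link = row.find("a")
    issues.append({
        "id": row["id"][len("row_checkbox_"):],
        "title": link.get_text(strip=True),
        "url": link["href"],  # kept for the per-issue scrape later
    })

for issue in issues:
    print(issue)
```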

(The code embedded here, GitHub Gist 4403233, could not be found.)

Each row can have information about who the issue is assigned to and whether it is currently closed, as well as a link to the individual issue page that we will need later. I grabbed all this info and put it into a SQLite database so that I could update it once I parsed the individual issue page.

GitHub treats the severity and type of issue as labels, so I add the severity and type to the issue_to_label table with a foreign key into the issues table. This makes it easier to add all the necessary labels later. CodePlex will only show up to 100 items per page, so I regenerate the direct link with info on which page I want and parse each page to get all the issues.
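A minimal sketch of that layout with the standard sqlite3 module follows. Only the issue_to_label table name and the idea of a foreign key into the issues table come from the description above; the column names and sample values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real script would use a file on disk
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical columns; one row per issue, one row per (issue, label) pair.
conn.executescript("""
CREATE TABLE issues (
    id     INTEGER PRIMARY KEY,
    title  TEXT NOT NULL,
    status TEXT,
    url    TEXT
);
CREATE TABLE issue_to_label (
    issue_id INTEGER NOT NULL REFERENCES issues(id),
    label    TEXT NOT NULL
);
""")

conn.execute(
    "INSERT INTO issues (id, title, status, url) VALUES (?, ?, ?, ?)",
    (101, "Example issue", "Active", "/workitem/101"),
)
# Severity and type both become plain labels attached to the issue.
conn.executemany(
    "INSERT INTO issue_to_label (issue_id, label) VALUES (?, ?)",
    [(101, "High"), (101, "Issue")],
)

labels = [
    row[0]
    for row in conn.execute(
        "SELECT label FROM issue_to_label WHERE issue_id = ? ORDER BY label",
        (101,),
    )
]
print(labels)  # prints: ['High', 'Issue']
```

Keeping labels in their own table means the later import step can fetch every label for an issue with one query, whatever CodePlex field it originally came from.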

Now that I have all the issues in a database, I select them all and iterate through them to parse the individual issue pages to grab all the information.

One thing to note here is that I actually used a different HTML parser for BeautifulSoup in different parts of the script. When parsing the Advanced View, I used “html5lib”, but while parsing the individual issues, I used “html.parser”. The reason is that each parser treats unclosed tags differently: one of them adds additional tags to make up for missing tags, the other does not. The HTML generated by CodePlex had some weirdness in the area of the descriptions of the issues, so using “html.parser” cleared some of those issues up and made the soup easier to work with.

While parsing each issue, there were four main areas that I wanted to get information from:

  1. Description
  2. Attachments
  3. Comments
  4. Metadata

CodePlex Issue Areas of Interest

Areas of interest

The description was pretty straightforward. I looked at the HTML for that area and found the following:

CodePlex Issue Description HTML

HTML for issue description area.

This was pretty easy to grab from the soup, but then I had an issue that there is possibly markup in the description content (bolds, italics, etc). So, I decided I would use the html2text module to convert the description into valid markdown that could be used directly on GitHub.

The attachments were also pretty easy, each one had a specific id that could be pulled out using BeautifulSoup:

Comments were a little trickier, since they had several bits of information that I was interested in. I wouldn’t be able to keep the original commenter as the author of the comment on GitHub, but I wanted to record when the comment was made and who made it in the comment text, and add these items as comments on the GitHub issues in the order they were made on CodePlex.

As you can see, with an understanding of how the HTML is put together, it is pretty easy to pull out the information you are interested in. Even though CodePlex doesn’t have an API, they do put a lot of information into the HTML of the issues that can be parsed out.

The metadata area was also fairly well structured. It is just a table contained within a div with the id “right_side_table,” and looping through the tr elements and pulling out the info is a piece of cake again.
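Here is a small sketch of that loop, again over invented markup. Only the “right_side_table” id is taken from the page; the two-cell row layout and the field names are assumptions for the sake of a runnable example.

```python
from bs4 import BeautifulSoup

# Invented stand-in for the metadata block on an issue page.
html = """
<div id="right_side_table">
  <table>
    <tr><td>Status</td><td>Active</td></tr>
    <tr><td>Type</td><td>Issue</td></tr>
    <tr><td>Impact</td><td>Medium</td></tr>
  </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

metadata = {}
# Find the container by id, then walk its rows as label/value pairs.
for row in soup.find(id="right_side_table").find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
    if len(cells) == 2:  # skip anything that is not a simple pair
        metadata[cells[0]] = cells[1]

print(metadata)
```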

Some of the metadata was used to update fields for the issue itself, but the rest were added to the description under a header “Work Item Details” to maintain the history of the information when the issue was moved from CodePlex to GitHub.

Once all the data was put into the database, it was pretty easy to import into GitHub using the PyGithub module. The one bad thing about this module is the lack of good documentation. I had to figure a few things out by just looking at the source code as well as looking at the GitHub API documentation to see what was possible with the different API calls.

Since the GitHub part is really easy to comprehend and the majority of this article was to talk about screen scraping, I will just provide the code for the script in the gist below.

The end result of the imported issues list can be seen below on a practice run on GitHub.

GitHub Issue List

Issues after being imported to GitHub

The severity (high, medium, etc.), the type (task, feature, etc.) and the component are all turned into labels with nice color coding on some of them.

The script migrates any plaintext attachments over as Gists and then puts a link to the Gist in the description area. Binary attachments are left on CodePlex and linked to directly. It would be better to have everything in one place, but GitHub doesn’t really have a good way of attaching binary items to tickets (or any attachments at all in fact).

The full script can be seen here. Feel free to fork and make improvements; I’d love to see any improvements you have made via pull requests.

After these messages… – Part 3 (sort of)

It’s been a while since I started the set of posts about porting CPython modules to IronPython. I still plan on coming back to this set of posts and show how I continue porting the library, things have just been VERY crazy at work lately and I’ve had to spend a lot of my at home time working. Needless to say, this is not the optimal solution and things should be quieting down a bit in the next few weeks. I promise I’ll come back and continue with the posts. You have my word!

Testing With IronPython

At my work, we do validation of system on a chip devices. We have our own internal hardware team that designs boards for us to use in testing. These boards are complicated pieces of equipment, containing multiple FPGAs, power supply components, and more. With this complication comes the possibility that something could go wrong with the board. I develop an application used by other engineers to use these boards to load and run their tests (which run on the embedded device). For a long time, in this application we had some simple board tests that would help debug issues as they came up. We started getting more and more products and the boards changed slightly and it was difficult to keep these tests up to par with what needed to be there for the different products. I had been playing with IronPython for some time on various projects at home and thought this might be a good place to use it. The idea being that changes are more easily made in scripts, and the hardware team could add new tests as problems crop up.

I decided that the unittest module for Python would do a great job of managing the tests that needed to be run and allowing grouping of the tests into test suites that could be run by different users.
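As a rough idea of what that grouping looks like, here is a sketch using only the standard unittest module. The test classes and their trivially passing checks are placeholders I made up; real board tests would talk to the hardware.

```python
import io
import unittest


class PowerTests(unittest.TestCase):
    """Placeholder for power supply checks."""

    def test_rail_voltage(self):
        self.assertAlmostEqual(3.3, 3.3, places=1)


class FpgaTests(unittest.TestCase):
    """Placeholder for FPGA checks."""

    def test_id_register(self):
        self.assertEqual(0xA5, 0xA5)


def build_suite(*cases):
    """Collect every test method of the given TestCase classes into one suite."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in cases:
        suite.addTests(loader.loadTestsFromTestCase(case))
    return suite


# Different users can get different groupings of the same tests.
quick_suite = build_suite(PowerTests)
full_suite = build_suite(PowerTests, FpgaTests)

result = unittest.TextTestRunner(stream=io.StringIO()).run(full_suite)
print(result.testsRun, result.wasSuccessful())
```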

I’m going to take you through the design and implementation of an application similar to that which I use at work to help our hardware team debug boards. It will not be the exact application I use at work, but will give you an idea of some of the simple, yet powerful things you can do with IronPython.

image

The UI is very simple: a simple menu with a single entry (File > Exit), a toolbar with four buttons, a listview that shows the currently loaded set of tests and an output window for the test output. Here you can see I loaded the test_datetime.py file into the GUI to run the datetime tests (and sadly it looks like there are failures in IronPython’s datetime module!).

I’m not going to go into heavy detail about the GUI development, that is not the important part of this exercise. Let’s walk through the steps necessary to load in a Python file that contains unit tests.

We’ve got a couple issues to solve for this application:

  1. How do we get the output from the unit tests to go to the output text box?
  2. How do we enumerate and show all the unit tests in the given file?

First things first, we need to get IronPython going inside of our application. Download the latest stable version from http://ironpython.codeplex.com (as of this writing, the latest is 2.7.1, which means it’s roughly compatible with CPython 2.7). For embedding IronPython into an application, I prefer to get the zip file release that contains the necessary assemblies. Unzip the distribution to any place you would like and add the following as references to your project:

  1. IronPython.dll
  2. IronPython.Modules.dll
  3. Microsoft.Dynamic.dll
  4. Microsoft.Scripting.dll
  5. Microsoft.Scripting.Metadata.dll

These are the basic assemblies that will be used for embedding IronPython into the application.

The creation of the GUI, as I said, is not the important part of this tutorial. If you have questions about anything I did, feel free to comment or drop me an email.

So, let’s start getting our application ready to run Python code.

We need an execution engine. For this application, I am only planning on supporting one engine for the entire application; I don’t have a need to support an engine per thread or anything like that. So, I’ll go ahead and create a ScriptEngine (Microsoft.Scripting.Hosting) at the class level of my main form.

Most often when I am embedding IronPython into an application, I will have a method to initialize the engine and setup any paths I may need, pre-import some modules and do various other initialization steps. Here is the InitializeEngine method.

A few things of note. You can see the creation of a ScriptScope object as well as the ScriptEngine object we already talked about. This ScriptScope can be thought of as the __main__ module for the execution of the Python code. With IronPython you can create multiple ScriptScopes per engine and execute your code in any of them and they are partially self-contained.

The code is also creating and setting up the stdout and stderr handlers for the ScriptEngine. This was an easy way to run the unit tests and get their output (since the default test runner just prints the results to the console). The OutputStream implementation is fairly simple and is shown below.

It uses events to send text to the main form when something is written to the stream. The InitializeEngine then sets the stdout and stderr for the ScriptEngine; now all text written to either stdout or stderr will be redirected and displayed in the output area.
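The OutputStream in the application is a C# class, so the following is only a Python analogue of the same idea, not the actual implementation: a file-like object that forwards every write to registered listeners, installed as stdout. The EventWriter name and its listener list are made up for this sketch.

```python
from contextlib import redirect_stdout


class EventWriter(object):
    """File-like object that raises an 'event' for every chunk written."""

    def __init__(self):
        self.listeners = []

    def write(self, text):
        for listener in self.listeners:
            listener(text)  # e.g. append the text to an output text box
        return len(text)

    def flush(self):
        pass  # nothing is buffered, so there is nothing to flush


received = []
writer = EventWriter()
writer.listeners.append(received.append)

# Anything printed inside this block goes to the listeners, not the console.
with redirect_stdout(writer):
    print("test_example ... ok")

captured = "".join(received)
```

In the GUI the listener would marshal the text onto the UI thread and append it to the output window instead of collecting it in a list.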

Now that an engine is setup, and the stdout and stderr output is redirected, the InitializeEngine method pre-imports a few modules so that the script writer doesn’t need to do so. The first three (sys, os, and unittest) are standard Python modules that come in the Lib directory of the IronPython distribution. The last one is a helper module used to help enumerate and execute the unit tests.

The first item defined is a class which wraps around unittest tests that are found. It provides a mechanism to add listeners (callbacks) for the end of a test. This is how the GUI is updated when a test completes (color change for pass/fail status, etc.).

The second item (getTests) is a method which uses the unittest module to find all of the unittest compatible tests in the loaded modules. This is one of the key things to remember about embedding IronPython: if you can do something much easier in Python code than trying to pull out variables and call Python methods from C#, then do it. Write a nice little wrapper method that you can easily call from C# and have it do most of the work. This will save you a lot of time with trying to get things to work. Python is great at interrogating Python code, use it to its full benefit.
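Something along these lines is what I mean, as a sketch rather than the actual helper module. The get_tests and flatten names, and the throwaway module built with types.ModuleType, are assumptions for the example; the loader calls are standard unittest.

```python
import types
import unittest


def flatten(suite):
    """Turn the nested suites a loader returns into a flat list of tests."""
    tests = []
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            tests.extend(flatten(item))
        else:
            tests.append(item)
    return tests


def get_tests(module):
    """Return every unittest-compatible test found in a loaded module."""
    return flatten(unittest.defaultTestLoader.loadTestsFromModule(module))


class SampleTests(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def test_upper(self):
        self.assertEqual("abc".upper(), "ABC")


# Stand-in for a module loaded from a user's .py file.
sample_module = types.ModuleType("sample_tests")
sample_module.SampleTests = SampleTests

names = sorted(test.id().split(".")[-1] for test in get_tests(sample_module))
print(names)  # prints: ['test_addition', 'test_upper']
```

The host application then only has to call one function and gets back a plain list of test objects it can hang on its list view items.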

The last item (runTests) is the method that is called by the host application to actually run the tests. It receives, from C#, a list (Python list) of tests to run. If there is a completion callback passed in, it wraps up the test case with the TestermaticTestCase wrapper and adds it to the test suite. Then, to run the tests, it’s as simple as passing the list off to the TextTestRunner class from the unittest module.

So, let’s look at how all of this gets pulled together.

Testermatic Toolbar

The toolbar on the Testermatic application has four buttons. The first is used to load a new set of tests. It shows an OpenFileDialog and once the user selects a Python file (*.py) it will enumerate the tests in the module and populate the list of tests. It uses the method below to load the tests.

The Import method is as follows

The second button on the toolbar is used to actually run the tests. It creates a List (Python list) using the items in the ListView that are checked (remember the test object — the Python test object — was assigned to the .Tag property of each ListViewItem so it is easy to pull out).

You can see that this is interacting with the Python code in a different way than was done to retrieve the list of tests. This retrieves a ScriptScope (think module) object for the testermatic Python module shown earlier. It then gets the runTests method from that module and, since it is a dynamic object, can call it just like a function. The TestCompleteDelegate is the callback for when the test completes.

The TestComplete method which is called just updates the UI based on which tests passed, which tests failed and which tests had an error (green for pass, red for fail, purple for error conditions).

The third item on the toolbar is just a reload button. This is helpful if you are editing the Python script containing the tests and you want to reload to get any new tests, or new changes you made to the file. This calls the same LoadTests method that was called when the file was selected. A new ScriptEngine instance and a new ScriptScope instance are created (and cleaned up if already created) every time the LoadTests method is called.

The final button in the toolbar is a simple little email capability, because when something fails it’s always nice to notify people about it. The button displays an email form as shown below to allow the user to enter To and From addresses as well as a message. The output from the tests will be added after the message and sent to both the To and From addresses.

Email Form

Now, do I use the application at work to run normal Python unit tests? No, as I mentioned before, this was mainly to allow the hardware team at my work to develop simple tests to quickly triage issues with the boards. They can quickly modify the test for special circumstances and run the whole suite of tests again. It also allows them to send the report to another person so they can review the test run.

I hope this simple little application has shown you how easy it is to incorporate IronPython into your application. I attached the project for this application below so you can download it and play around with it.

Testermatic Project

Setting up the environment – Part 2

Part 1 of this series is available here

To begin porting a CPython module to IronPython, a few things are needed. I will explain how to get some of them and leave the rest up to the reader to research and download/install. I am assuming that you will be doing development on Windows; if you prefer Linux, there are equivalents available and I will note them below.

  • .NET Framework 4.0 SDK (Windows) or Mono 2.10.x (Linux)
    • This includes tools such as msbuild (xbuild for Mono), the C# compiler and other such tools. This is the bare minimum required for building IronPython from sources.
  • TortoiseGit (or any other Git client) for Windows or git (Linux)
    • This will be used to get the latest sources from Github for IronPython.
  • Visual Studio 2010 (Windows recommended) or MonoDevelop 2.8 (Linux)
    • You CAN build IronPython sources using only the .NET SDK, but having a good debugger makes life much easier.
  • Python 2.7.2 sources (http://www.python.org)
    • The Python sources are required so that you can look at internal implementations of C based modules to make sure your IronPython modules match the implementation.

If you are developing on Linux, I highly recommend either getting the latest Mono tarball or building from source yourself. The Mono packages in the various Linux distributions’ repositories are usually not the latest and greatest release, and some of that latest and greatest stuff is nice to have when developing for IronPython.

If you are not familiar with open source development on Github, here’s a quick and dirty tutorial which should get you going.

GitHub is called a social development platform. The idea is to get developers communicating back and forth and provide ways for communities to work together. That being said, the major benefit of GitHub for open source projects is that it makes contributing very nice and much less painful than other methods. The general way to contribute to a project is to fork it. To fork a project on GitHub means that you create your own working area for the project, with its own source control history of your changes, and changes that you incorporate from other people into your branch of the source.

Obviously, working on GitHub requires an account, so if you don’t already have one, you’ll need to sign up for an account. Then you can fork to your heart’s content. I would also like to mention that, strictly speaking, you do not have to set up a GitHub account. If you have access to your own git repository and server, you could theoretically send patches to the IronPython team that way, but working through GitHub is strongly preferred.

When you browse to the IronPython project on Github (which is actually part of the IronLanguages project) at https://github.com/ironlanguages/main you’ll see something like below (depending on if you are logged in or not).

image

Github provides you several ways of getting the source for the project.

  1. Click the “ZIP” link right above the “Files” tab of the UI. This will grab the head (latest) of the source code and allow you to download it as a zip file. If you want to peruse the source and see how things are done before jumping in head first, this can be a good way of doing that.
  2. Depending on your role in a project (developer, administrator, etc.) you will have the option of having a direct committable URL for the project. There are SSH and HTTP options available for developers and administrators that allow directly pushing changes to the main project. Most people do NOT have this access.
  3. The other option, for those who are not in the project developers group directly, but would still like to get the source with all the Git information is the “Git Read-Only” option. This allows you to git [sic] the source and pull updates, but does not allow you to commit anything to the project.

These options are for when you only want to get the source and not really hack on it. If you want to hack on the source and contribute back new features or bug fixes, then you will want to fork the project [1].

image

The important button in this case is the “Fork” button which is near the top of the page. As shown above, you have to be logged in to be able to fork a project. Forking a project creates your own personal work area. You are automatically added to the “developers” list for the project and given push access to your fork of the project. Github will tell you there is hardcore forking action going on and then redirect you to the project page for your fork as shown below.

image

You can see that the project in my case is now “slide / ironlanguages” not “IronLanguages / main.” When you fork IronPython specifically, you will actually end up with a project called “username / main” which in my mind was not very descriptive of what it actually was, but luckily Github lets you rename your fork. Click on the “Admin” button on the fork’s project page and you can change the name in the repository settings.

image

I renamed mine to ironlanguages, so I knew exactly what it was (I contribute to some other open source projects on Github, so keeping them straight is important).

Now that you have a fork, you are ready to get the sources and start hacking away. Browse to the directory on your computer where you would like to keep the source and use the Git client to clone the repository URL displayed on your fork [2].
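For those working from a terminal rather than a graphical client, the clone step can be sketched as the pair of git commands built below. This is only a rough sketch: the fork URL is a placeholder (substitute the URL shown on your own fork), the "upstream" remote name is just a common convention, and the helper names are made up for illustration. The script prints the commands and does not run them.

```python
import shlex

# Upstream project URL as given in this post.
UPSTREAM_URL = "https://github.com/ironlanguages/main"


def clone_commands(fork_url, target_dir, upstream_url=UPSTREAM_URL):
    """Build the git commands for cloning a fork and adding the
    original project as a second remote named "upstream"."""
    return [
        # Clone your fork; git names this remote "origin" by default.
        ["git", "clone", fork_url, target_dir],
        # Register the original project so its changes can be fetched later.
        ["git", "-C", target_dir, "remote", "add", "upstream", upstream_url],
    ]


def as_shell(command):
    """Render one argument list as a copy-and-paste shell line."""
    return " ".join(shlex.quote(arg) for arg in command)


if __name__ == "__main__":
    # Hypothetical fork URL; replace "username" with your Github account.
    fork = "https://github.com/username/main.git"
    for cmd in clone_commands(fork, "ironlanguages"):
        print(as_shell(cmd))
```

Adding the original project as a second remote is optional, but it makes it easier to pull other people's changes into your fork later on.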

If you are using TortoiseGit on Windows, browse to the root of where you want the source and right click to show the TortoiseGit context menu entries.

image

Select “Git Clone…”

image

Most of the necessary information will be filled in for you when you paste in the URL for the repository you want to clone. If it is not, use the image above as a guide.

You will notice that I am loading a Putty key. This takes care of authentication, so you do not have to enter a password for every operation with Github. I highly recommend you follow the how-to on Github for setting up your keys [3].

When you press OK, a progress dialog will appear showing you the progress of the clone.

image

This will take some time as there is quite a bit of code in the repository (including IronRuby, the DLR, Python files for the standard library, and more). If everything goes as planned, you will see something like the following:

image

Now, you should have a local copy of the repo in the directory you specified.

image

Interesting areas to look at in the code:

  1. Solutions directory – contains the Visual Studio solution files for the various projects that can be built (remember to use IronPython.Mono.sln if building on Linux).
  2. Languages\IronPython\IronPython.Modules – contains most of the “native” modules that are ports from C modules
  3. Languages\IronPython\IronPython – contains the actual implementation (parser, compiler, etc.) for IronPython
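As a quick sanity check that the clone is complete, something like the following sketch could confirm that the directories listed above exist. The function name is made up for illustration, and it assumes that "Solutions" sits directly under the root of the checkout.

```python
from pathlib import Path

# Directories named in the list above, relative to the repository root.
# Assumption: "Solutions" sits directly under the root of the checkout.
EXPECTED_DIRS = [
    Path("Solutions"),
    Path("Languages", "IronPython", "IronPython.Modules"),
    Path("Languages", "IronPython", "IronPython"),
]


def missing_dirs(repo_root):
    """Return the expected directories that are absent under repo_root."""
    root = Path(repo_root)
    return [d.as_posix() for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    import sys

    # Check the directory given on the command line, or the current one.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_dirs(root)
    if missing:
        print("Missing from checkout:", ", ".join(missing))
    else:
        print("Checkout layout looks as expected.")
```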

Once you have the IronPython sources on your local system, take a look at the CPython source code.

image

Interesting areas to look at:

  1. Modules – contains the implementation of the “native” modules (written in C). This is where we look for the CPython implementations of modules we want to port.
  2. Python – contains implementations of several key Python components such as the importer, dynamic loading, and more. This can be useful to determine what Python API function calls in the C source code are doing so the functionality can be replicated.
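When hunting for the C source of a particular module, a small helper along these lines can save some scrolling through the Modules directory. It is only a sketch: the function name is hypothetical, and the three file-name patterns it tries (such as pyexpat.c, timemodule.c and _csv.c) are common in the CPython tree but are not an exhaustive rule.

```python
from pathlib import Path


def candidate_sources(cpython_root, module_name):
    """List files in Modules/ that look like the C source of module_name."""
    modules_dir = Path(cpython_root) / "Modules"
    # File-name patterns to try. These are assumptions based on common
    # names in the tree (pyexpat.c, timemodule.c, _csv.c), not a full rule.
    names = [
        module_name + ".c",
        module_name + "module.c",
        "_" + module_name + ".c",
    ]
    return [modules_dir / n for n in names if (modules_dir / n).is_file()]


if __name__ == "__main__":
    import sys

    # Usage: python find_module.py <cpython source root> <module name>
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    name = sys.argv[2] if len(sys.argv) > 2 else "pyexpat"
    for path in candidate_sources(root, name):
        print(path)
```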

Next time, I’ll start pulling apart the CPython pyexpat module source code and create a skeleton IronPython module.

References

  1. Github has a much better tutorial on forking a project available at http://help.github.com/fork-a-repo/
  2. Github’s documentation http://help.github.com/ is pretty good, I look through it all the time.
  3. Look at http://nathanj.github.com/gitguide/tour.html under “Pushing to a Remote Server” for information on generating your own SSH keys for use with Github.