Thursday, February 10, 2011

Making Pizza with Python

    I read a post on Reddit recently from a guy who wrote a Python program to play Bejeweled automatically. Since I've always been interested in the concept of programs driving other programs, I decided to try to write a program to play a more sophisticated game. I chose Papa's Pizzeria, by Flipline Studios. You can find the game at Kongregate. In the game, you are Roy, the delivery boy of the pizzeria. You come to work one day, and Papa is missing. He left you a note to tell you he is gone and asks you to run the pizzeria yourself. Customers are already arriving at the door, so you have no choice but to try to measure up.

Roy running the shop

    To play the game, you have to juggle different tasks: take orders from customers, make the pizzas by putting the right toppings in the right places, bake them for the right amount of time and finally cut them into the correct number of slices. Customers will then rate you on those four activities. The game starts easy with only two customers to serve in one day, using only one topping and very short baking times. Then it becomes more and more difficult with up to ten customers a day, each one with several different toppings in different places, with different cutting and baking times.

    To be successful, you must be able to multi-task. If you do each pizza in turn, customers will wait too long and give you bad ratings on that criterion. But doing several things in parallel is tricky, especially since you run the risk of missing the right baking time. Imagine you have a pizza in the oven that's just about right. If you go take another order, it might take ten seconds, and when you are back at the oven, your lovely pizza will be overbaked and you will receive a mediocre baking rating. So, you have to keep things in mind, organize your order tickets the best you can in the limited space that is available, and be quick when placing toppings and cutting the pies. After some time, you begin to get the knack of it and you can run a ten-customer day in around ten minutes. And that's when it starts to feel repetitive. Time to write a bot!

Python to the Rescue

    What is needed to write a pizza baking bot? At the most basic level, three things: a way to recognize what happens on the screen, a way to act in the game, and some thinking between those two. I won't dare to call it AI, at least in my implementation. It sure won't pass the Turing test!
    If you want to jump right in and see the bot in action, skip the boring explanations and go to the last section.

Reading the screen

    So, how does my pizzabot know what's on screen? Well, I'm not ashamed to tell you there is no complex image recognition in play. The program takes screenshots of portions of the screen and compares the pixels with some expected values. It probably makes the bot more fragile and less flexible, but a hell of a lot easier to program and faster to run. It also helps that the game is somewhat slow: it's not an action game where you have to react in a tenth of a second.

    I used two main methods to read the screen. The first one is simply to compare one precise pixel with its expected color. For instance on the image below, what you can see is a close-up of the bottom left corner of an order ticket, where the baking time appears. I manually took screenshots of all possible values, blended everything into one image, chose a set of very significant pixels (pointed to by the blue arrows) and read the expected RGB values of those pixels. When trying to determine baking time, the bot takes a small screenshot (the blue rectangle) and compares each significant pixel with its expected value. As soon as one of them matches within a given tolerance, the baking time is known.
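    As a rough illustration of that single-pixel test (not the actual botutil code), a minimal sketch could look like the following. The probe coordinates, the colors, the default tolerance and the names color_matches, read_baking_time and FakeShot are all made up for the example; the only assumption about the screenshot is that it offers a PIL-style getpixel((x, y)) method.

```python
# Illustrative sketch only: coordinates, colors and names are hypothetical.

def color_matches(actual, expected, tolerance=3):
    """True if every RGB channel is within `tolerance` of the expected value."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


# Hypothetical probe table: for each baking time (in steps on the scale),
# one significant pixel (x, y) of the small screenshot and its expected color.
BAKING_PROBES = {
    1: ((4, 7), (200, 30, 30)),
    2: ((9, 3), (30, 200, 30)),
    3: ((2, 11), (30, 30, 200)),
}


def read_baking_time(shot, probes=BAKING_PROBES, tolerance=3):
    """Return the first value whose probe pixel matches, or None.

    `shot` is anything with a PIL-style getpixel((x, y)) method that
    returns an (R, G, B) tuple, such as a cropped screenshot.
    """
    for value, (xy, expected) in sorted(probes.items()):
        if color_matches(shot.getpixel(xy)[:3], expected, tolerance):
            return value
    return None


class FakeShot(object):
    """Stand-in for a screenshot so the sketch runs without PIL."""

    def __init__(self, pixels, background=(255, 255, 255)):
        self.pixels = pixels          # {(x, y): (R, G, B)}
        self.background = background  # color of every other pixel

    def getpixel(self, xy):
        return self.pixels.get(xy, self.background)
```

    Returning None when nothing matches lets the caller decide whether to retry or give up, instead of silently guessing a baking time.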

Reading baking time

    That's how the bot sees most game elements: button presence, beginning and end of order by a customer, baking time, number of slices, topping positions. The same technique is used at startup to determine the exact position of the game on the screen, but with an array of nine pixels instead of one: they must match the green chequered background at the top left of the game area. Several positions can actually match, so the bot scans the screen in 5-pixel increments (which makes the scan 25 times quicker than checking every position) and once a possible match is found, it gently slides toward the top left as long as the new position is still a match.
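    The coarse scan followed by the slide to the top left can be sketched as below. This is only an illustration under assumptions: is_match(x, y) is a hypothetical predicate standing in for the nine-pixel comparison, and find_origin and slide_to_top_left are names invented for the example. The coarse scan can only work if the matching region is at least as wide and as tall as the step.

```python
# Illustrative sketch only: `is_match` stands in for the nine-pixel test.

def slide_to_top_left(is_match, x, y):
    """Move one pixel at a time toward the top left while still matching."""
    moved = True
    while moved:
        moved = False
        if x > 0 and is_match(x - 1, y):
            x -= 1
            moved = True
        if y > 0 and is_match(x, y - 1):
            y -= 1
            moved = True
    return x, y


def find_origin(is_match, width, height, step=5):
    """Scan every `step`-th position, then refine the first hit.

    Stepping by 5 in both directions tests 25 times fewer positions
    than a pixel-by-pixel scan. Returns (x, y), or None if nothing matches.
    """
    for y in range(0, height, step):
        for x in range(0, width, step):
            if is_match(x, y):
                return slide_to_top_left(is_match, x, y)
    return None
```

    A caller should check for None before using the result, since the scan finds nothing when the game area is hidden or the colors differ from the expected ones.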

    The other technique, used for reading toppings and topping counts, is a simple comparison with reference images. Here, comparing only one pixel with one expected value would be too error prone. So I took screenshots of all possible toppings and all possible counts, cut out small images, and painted the background portions of the images in pure blue. The bot loads those images and computes a distance between what it sees on screen and the reference images. Blue pixels are not part of the comparison. I could probably have used some alpha channel, but I had only experimented with loading opaque PNG files with PIL and I figured that blue was not too common in toppings (when was the last time you ordered a pizza with something that blue on it?).
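    Here is a minimal sketch of such a masked comparison. The distance used (mean absolute difference per color channel) is an assumption chosen for the example, since the exact metric is not described here, and masked_distance and best_match are hypothetical names. Images are represented as equal-length sequences of (R, G, B) tuples, for instance the flattened pixels of two same-sized images.

```python
# Illustrative sketch only: the metric and the names are assumptions.

BLUE = (0, 0, 255)  # pure blue marks "ignore this pixel" in a reference image


def masked_distance(sample, reference, mask_color=BLUE):
    """Mean absolute per-channel difference over the non-masked pixels.

    `sample` and `reference` are equal-length sequences of (R, G, B)
    tuples listed in the same order.
    """
    total = 0
    counted = 0
    for s, r in zip(sample, reference):
        if r == mask_color:
            continue  # background pixel: not part of the comparison
        total += sum(abs(a - b) for a, b in zip(s, r))
        counted += 1
    if counted == 0:
        return float("inf")  # a fully masked reference can never match
    return total / (3.0 * counted)


def best_match(sample, references):
    """Return the name of the reference image closest to `sample`.

    `references` maps a name (e.g. a topping) to its pixel sequence.
    """
    return min(references,
               key=lambda name: masked_distance(sample, references[name]))
```

    Dividing by the number of compared pixels keeps references with large blue areas from looking artificially close just because fewer pixels were counted.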

Comparison of toppings from screenshot with reference images

Acting

    That's probably the least interesting part. Most of the work was to take reference screenshots of the different screens and do some measurement to find the location of actionable items: buttons, toppings, order ticket, cutting lines. All the coordinates are stored in dictionaries or constants or even used directly in the code. I know, I know, that's dirty... But the bot is not supposed to be a big enterprise project shared among tens of people and maintained over decades. It's supposed to be fun for me to write and for you to run. Nothing more. Hopefully the functions where those magic coordinates live have meaningful names like goto_topping_station() or click_save_for_later().
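    To give an idea of what such a coordinate table can look like, here is a small sketch. The positions are invented, BUTTONS, to_screen and make_clicker are hypothetical names, and the click callable stands for whatever platform-specific call generates the mouse event; none of this is taken from botutil.

```python
# Illustrative sketch only: positions and names are hypothetical.

# Positions relative to the top-left corner of the game area.
BUTTONS = {
    "topping_station": (310, 452),
    "save_for_later": (560, 80),
}


def to_screen(origin, relative):
    """Translate game-relative coordinates into absolute screen coordinates."""
    return origin[0] + relative[0], origin[1] + relative[1]


def make_clicker(origin, click):
    """Build a click-by-name function.

    `origin` is the (x, y) of the game area on screen and `click(x, y)`
    is the function that actually generates the mouse event.
    """
    def click_on(name):
        x, y = to_screen(origin, BUTTONS[name])
        click(x, y)
    return click_on
```

    With a helper like this, a function such as goto_topping_station() reduces to a one-line call like click_on("topping_station"), and the game can sit anywhere on the screen.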

Thinking tactically

    During a game, the bot keeps a record of the current status: how many orders have been taken, which state they are in, whether customers are waiting in line, what's in the oven, how long it's been there and how much time is left before it is perfectly baked... With all that information, it regularly makes a list of possible actions, assigns a priority to each of them and picks the top one to execute. That's all very simple. The priority is just a combination of static preferences and waiting time (I prefer quickly putting an already prepared pizza in the oven when a slot is available, then taking orders or serving customers). That way, the longer an order has been in a given state, the more likely it is to be chosen as the next to be processed.
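    One possible way to combine a static preference with waiting time is sketched below. The weights, the one-point-per-second rule and the names STATIC_PRIORITY, score and pick_action are assumptions made for the example, not the values the bot uses.

```python
# Illustrative sketch only: weights and names are hypothetical.
import time

# Static preferences: higher means more urgent.
STATIC_PRIORITY = {
    "put_in_oven": 30,
    "take_order": 20,
    "serve": 20,
    "make_pizza": 10,
}


def score(action, now):
    """Static preference plus one point per second spent waiting."""
    kind, waiting_since = action
    return STATIC_PRIORITY[kind] + (now - waiting_since)


def pick_action(actions, now=None):
    """Return the (kind, waiting_since) action with the highest score."""
    if not actions:
        return None
    if now is None:
        now = time.time()
    return max(actions, key=lambda action: score(action, now))
```

    Because the waiting time keeps growing, even a low-preference action eventually outranks the others, so no order is starved.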
    There is one subtlety: taking pizzas out of the oven must always have the highest priority, because if you miss the ideal baking time, you risk losing a lot of points. So the "out of oven" action is managed completely outside the priority queue. When evaluating possible actions, if a pizza will be perfectly baked within a few seconds, the bot goes to the oven and waits for the perfect time. I felt it was needed because most other actions can take 2 to 10 seconds and cannot be interrupted. What is also helpful is that the game seems perfectly timed. There is actually no need to read the baking indicators in the oven: one step on the scale (one eighth of a turn) is exactly 22.5 seconds long. So the bot records the time when each order is put into the oven and computes the exact time when it will be perfectly baked. Pretty easy stuff. Thanks again to Flipline Studios for making it easier: actually reading moving "needles" would have been trickier.
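    The timing arithmetic is simple enough to sketch in a few lines. Only the 22.5 seconds per step comes from the observation above; the 3-second head start and the names done_time and next_oven_job are assumptions for the example.

```python
# Illustrative sketch only: HEAD_START and the names are hypothetical.

SECONDS_PER_STEP = 22.5  # one step on the scale (one eighth of a turn)
HEAD_START = 3.0         # go and wait at the oven this many seconds early


def done_time(put_in_at, steps):
    """Moment at which a pizza put in at `put_in_at` is perfectly baked."""
    return put_in_at + steps * SECONDS_PER_STEP


def next_oven_job(oven, now, head_start=HEAD_START):
    """Return (slot, done_at) for the pizza to take out next, or None.

    `oven` maps a slot number to (put_in_at, steps). Only pizzas that
    will be done within `head_start` seconds are considered.
    """
    due = [(done_time(put_in_at, steps), slot)
           for slot, (put_in_at, steps) in oven.items()]
    due = [item for item in due if item[0] - now <= head_start]
    if not due:
        return None
    done_at, slot = min(due)
    return slot, done_at
```

    For example, a pizza ordered with two steps of baking and put in at t = 100 s is due at 100 + 2 × 22.5 = 145 s, so with this head start the bot would walk to the oven from t = 142 s onward.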

No need to read the baking timers

Putting it all together

    That's almost all there is to say. The complete scenario of the driving function is the following:
  • find the offset of the game area
  • click on "Start game"
  • choose a saved game
  • then, for as many games as asked:
    • look for possible actions (take order, make pizza, put prepared pizza into oven, get pizza out of oven, cut and serve pizza)
    • execute the one that is deemed the most important
    • until the shop is closed and all orders have been processed
    • look at the results for five seconds
    • look at the tips for five seconds
    If anything goes wrong during a game (it sometimes happens), the bot exits.
    The five-second waits at the end of each round are there so you have a chance to stop the program without losing your score. You can cleanly kill Python and close the game.
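    The scenario above can be sketched as a short driving loop. Everything here is an assumption for illustration: game is a hypothetical object wrapping the screen reading and clicking, its method names are invented, and play is not the real start_game function, although it takes the same save and rounds arguments.

```python
# Illustrative sketch only: `game` and its methods are hypothetical.
import time


def play(game, save, rounds, pause=5.0, sleep=time.sleep):
    """Play `rounds` days on saved game number `save`.

    Any exception raised by `game` simply propagates and ends the run,
    which mirrors "if anything goes wrong, the bot exits".
    """
    game.find_origin()
    game.click_start()
    game.choose_save(save)
    for _ in range(rounds):
        # Keep picking the most important action until the shop is
        # closed and all orders have been processed.
        while not game.day_is_over():
            action = game.pick_action()
            if action is not None:
                action()
        game.show_results()
        sleep(pause)  # a window to stop the program without losing the score
        game.show_tips()
        sleep(pause)
```

    Passing sleep as a parameter is only a convenience: it lets the loop be exercised quickly with a fake game object, without actually waiting five seconds.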

Wanna try it?

    The whole source is hosted on github. Get it, and follow the directions:

Prerequisites:
  • Python 2.x. Works at least with 2.6 and 2.7. Don't know about Python 3.
  • Windows (sorry about that!). It can probably be adapted easily: only a few functions make win32 calls (event generation only)
  • PIL for screen capture and image file loading
  • You should also have played the first day of the game: that one is special because of the intro but mainly because the tutorial is a bit intrusive.

Run the bot:
  1. launch your favorite Python shell. I really like IPython, but the standard shell works the same.
  2. import botutil, that's the only source file
  3. open a browser window on the game (here's the address again on Kongregate)
  4. make sure the whole game area is visible, especially the top left corner: that's where the bot looks first. Also make sure you are on the title page (with Papa's Pizzeria title and the three yellow buttons)
  5. launch the bot using a given save game for a number of rounds with botutil.start_game(save, rounds). save should be 0, 1 or 2. rounds can be any number greater than zero.
  6. watch the bot making perfect pizzas. It always gets 100% on toppings and cutting. It gets 100% most of the time on baking, and between 90% and 100% on waiting time, which usually gives a general score around 99%.

    If it doesn't work for you, just ask me: maybe I forgot something. And if it works for you, you can tell me too.
Also, I'm not a very experienced Pythonista yet, so feel free to tell me where I'm using the wrong idioms or where I have used over-complicated code.

And enjoy your lovely baked Python pizzas!

P.S. Video now available.

14 comments:

  1. Great, I also did this some time ago (a different game) but in Java, I'll tell why I didn't do it in Python. I assume you take your screenshots using PIL's ImageGrab. Unfortunately that module only works on Windows :(

  2. This comment has been removed by the author.

  3. Excellent work. It was really fun to watch. My bot did however have a bug that seemed to be completely reproducible. After a pizza was successfully delivered and the score was shown, it tries to go back to the lobby, but the screen always ends up on the cutting board, as if it clicked too soon. When at the cutting board, the mouse cursor is over the green button, and if I manually click, the game can continue as normal, but if I do not, I will get a timeout error.

    If you need any additional information on this, I would be happy to provide.

  4. Interesting project. I've always wanted to do something like this (with no concrete idea of which game to use, mind you). Now I have some code to play with and with a language I know well enough (it would be more fun in Lisp, though :D )

    Cheers,

    Ruben

  5. Nice project, just thought can we make burger with python also :)

    Cheers
    Javin
    Why String is immutable in Java

  6. @André
    Yes, I'm using PIL ImageGrab. I'm currently looking at another solution: porting the platform-specific parts of Java AWT Robot to a Python module. Since Sun's Java is under the GPL, I think borrowing that code to make a GPL module would be ok. I might need some help to compile the native parts on all platforms.

    @Doug
    Do you mean that each time a pizza is cut and served the bot fails to go back to the lobby? I just pushed a small modification to the "finish_order" function, adding a small delay. I would be glad to know if it fixes your problem.

    @javarevisited
    You're right. Burgers might be my next target. ;)

  7. By the way, here is a video of a day played by the bot, with the console visible
    http://www.youtube.com/watch?v=oQNkTCY6tkQ

  8. Yes, the new change does in fact fix the problem I described, and it works perfectly now. I can't wait to see it on the later levels.

  9. Getting this error:
    Traceback (most recent call last):
    File "", line 1, in
    File "botutil.py", line 740, in start_game
    print("Origin: (%d, %d)" % origin)
    TypeError: %d format: a number is required, not NoneType
    What am I doing wrong?

  10. @m4g1ch
    Are you sure the whole game area is visible when you start? It seems the bot cannot find the top-left corner of the game.
    Or maybe the colors don't match...

  11. Here is a picture:
    http://img507.imageshack.us/img507/6448/pizzabot.jpg
    Thank you for your interest!

  12. @m4g1ch
    I reproduced the problem.
    I just changed the color tolerance from 3 to 6 and submitted the patch to github. It works for me now. Maybe a new version has been deployed with slightly different colors. Or a new version of the Flash Player might also have changed the colors.

  13. That did it! The weird thing is that I think I tried changing the tolerance too, but didn't work :O Anyways, great job!! I got 100% in all except baking (99%). Hopefully, I will learn a lot from this bot (I'm trying to learn Python).
    Thanks again!

  14. Maybe you just forgot to reload the module after your change.
    With IPython, you can do a "reload botutil" after each modification.
