The end of May and the beginning of June were a tough time for me. I’ve passed the final exam for my Master’s degree and, finally, I am an M.S. in Computer Science. Now I’m preparing for the Ph.D. exams. Meanwhile, I’m working on Splash this summer, and I’ve got something to tell you.
The first thing on my list is the splash:with_timeout API. It wraps any function and lets it run for only a specified amount of time in a Splash Lua script.
Originally, I was going to add timeout functionality to only one API, splash:go, but after discussing it with my mentor, we decided to create a more general API.
There are two possible ways to implement this API:
- Using Lua.
- Using Python.
Both of them have their advantages and disadvantages. The Lua implementation is simpler. It requires only existing APIs such as splash:wait, plus some kind of polling (an infinite loop) to check whether the running callback has finished its work. Polling isn’t the best solution, because it makes the CPU do a lot of unnecessary work.
On the other hand, the Python implementation can know when the callback has finished its execution and notify the main event loop. It’s also more flexible and configurable. So, we decided to write this API in Python.
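To make the difference between polling and notification concrete, here is a minimal sketch. It is not Splash’s code: it uses a plain thread and a threading.Event as a stand-in for the real event loop, and the name wait_for_callback is hypothetical. The waiter sleeps until the worker signals completion or the timeout expires, so no loop has to keep checking.

```python
import threading
import time


def wait_for_callback(callback, timeout):
    """Run callback in a worker thread; return True if it finished in time.

    Hypothetical helper: a thread plus an Event stands in for an event loop.
    """
    finished = threading.Event()

    def worker():
        try:
            callback()
        finally:
            finished.set()  # notify the waiter, so nobody has to poll

    threading.Thread(target=worker, daemon=True).start()
    # Event.wait blocks until set() is called or the timeout expires.
    return finished.wait(timeout)


quick = wait_for_callback(lambda: None, timeout=1.0)
slow = wait_for_callback(lambda: time.sleep(0.5), timeout=0.05)
print(quick, slow)
```

Note that the slow callback keeps running in the background after the timeout, which is exactly the problem described in the rest of this post.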
The first thing to think about is callback execution. In the current Splash version some API functions take a callback as an argument. These callbacks are executed as coroutines, which are created using the Splash#get_coroutine_run_func method. Earlier, there wasn’t any need to stop a running coroutine. However, the main idea of splash:with_timeout is the ability to run a function for only the specified amount of time.
The first solution was to just ignore the success and error callbacks from the running function. The idea is simple but not correct. Consider the following Lua script:
function main(splash)
    local ok, result = splash:with_timeout(1, function()
        splash:wait(2)
        assert(splash:go("https://google.com"))
    end)
    splash:go("https://www.python.org")
    splash:wait(3)
    return splash:url()
end
The first argument of splash:with_timeout is the number of seconds you want to wait and the second one is your callback. As you can see, we set the timeout to 1 second, and in the callback we wait for 2 seconds and then try to go to https://google.com. After that we navigate to https://www.python.org, wait for 3 seconds and return the current URL. The resulting URL, obviously, should be https://www.python.org, because the callback of splash:with_timeout exceeds its timeout. However, the resulting URL will be https://google.com. The reason is that we didn’t stop the callback when 1 second had elapsed.
So, I implemented the coroutine stop functionality. I added a new method to BaseScriptRunner, BaseScriptRunner#stop. It sets a flag to True, and during the coroutine execution that flag is checked: if it’s True, a StopIteration exception is raised and the coroutine stops its execution.
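The stop flag can be illustrated with a small generator-driven runner. This is only a sketch under my own assumptions: the class StoppableRunner, its attributes and the script generator are hypothetical and are not the real BaseScriptRunner. It also closes the generator when the flag is set, which is a simplification of raising StopIteration.

```python
class StoppableRunner:
    """Hypothetical stand-in for a script runner that drives a generator."""

    def __init__(self, coroutine):
        self._coroutine = coroutine
        self._stopped = False
        self.steps = []  # commands that were actually executed

    def stop(self):
        """Ask the runner to abandon the coroutine before its next step."""
        self._stopped = True

    def step(self):
        """Advance one step; return False when finished or stopped."""
        if self._stopped:
            self._coroutine.close()  # simplification: close instead of raising
            return False
        try:
            self.steps.append(next(self._coroutine))
        except StopIteration:
            return False
        return True


def script():
    # Each yield models one asynchronous command of the callback.
    yield "wait"
    yield "go https://google.com"


runner = StoppableRunner(script())
runner.step()                   # runs "wait"
runner.stop()                   # the timeout fired
still_running = runner.step()   # "go" is never executed
print(runner.steps, still_running)
```

Because the flag is checked before every step, the navigation to https://google.com from the example script would never start once the timeout has fired.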
There was an interesting conversation with my mentor about how I should handle errors from the callback.
There are two ways to handle errors and exceptions in Splash Lua scripts:
- Return a flag ok, which tells whether the operation was successful or not, and result, which contains the result of the operation or the reason for the failure.
- Return result, which contains the result of the operation, and raise an exception using error(...) if the operation failed.
In Lua, exceptions are thrown only if the user did something wrong (e.g. passed wrong arguments), so we chose the first solution, because a callback timeout is related to the API implementation rather than to a mistake by the user.
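The chosen convention can be sketched in Python as well. The helper as_flag_and_result and the "timeout_over" reason string are hypothetical names for illustration, not part of Splash: the point is only that a failure is reported through the first return value instead of being raised.

```python
def timed_out():
    # Hypothetical operation that fails because its timeout expired.
    raise TimeoutError("timeout_over")


def as_flag_and_result(operation):
    """Never raise: return (ok, result) where result is a value or a reason."""
    try:
        return True, operation()
    except Exception as exc:
        return False, str(exc)


ok, result = as_flag_and_result(timed_out)
ok2, result2 = as_flag_and_result(lambda: 42)
print(ok, result, ok2, result2)
```

With this style the caller decides what to do on a timeout, for example retrying or moving on, without having to catch an exception.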
You can see my work in PR #465.
This week I’m finishing all my exams, and then I can start spending more time working on GSoC.
Thank you for reading. See you next time.