After our collective session on Wednesday, and Saskia’s remarks about incremental practice, a few thoughts came to mind. First, given the knowledge I currently have (of Machine Learning, and especially Deep Learning), it is quite difficult to produce small works or prototypes, or at least so my brain says. I have come to doubt the efficiency of my approach, which until now has mostly been to work with relatively large chunks of knowledge and equally large projects: slow but deep, and manageable only one at a time. If I were to continue with this methodology for my final project, it would mean continuing to study Parag Mital’s course on TensorFlow, then finding other resources on Machine/Deep Learning for text generation, and only then starting to play and produce bits of work.

This has to change.

The danger is too great that I end up lost in the depths (!) of learning, especially if that involves heavy-duty mathematics and slightly unwieldy coding tools (TensorFlow is not particularly approachable). The solution would be to redirect my efforts toward smaller chunks of learning, and more specifically to make time for the less ‘cutting edge’ parts of my project: go back to previous works which I submitted but feel I could expand on and/or perfect, and simply do that. The projects I am thinking of are mainly these three:

  • WordSquares (with tasks such as porting the database builder to Python, improving the whole algorithm with a trie, finding ways of getting the Word2Vec model to work, and getting the Bokeh library to visualise datasets, even large ones);
  • WordLaces (generalising the database building and the search process, then applying machine learning and Bokeh visualisation as above);
  • SubWords (porting to Python, thinking up various forms and constraints, then applying the same database building & machine learning pipeline again).
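The trie idea for WordSquares could look something like the following: a minimal Python sketch, assuming a plain list of equal-length lowercase words (the function names and the sample word list are illustrative, not taken from the actual project). The trie lets the search prune column prefixes that cannot lead to any word, instead of scanning the whole word list at every row.

```python
# Sketch of a trie-pruned word-square search (illustrative, not
# the project's actual implementation).

def build_trie(words):
    """Nested-dict trie; the '$' key marks the end of a word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node['$'] = True
    return root

def words_with_prefix(trie, prefix):
    """Yield every stored word that starts with `prefix`."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return  # no word has this prefix: prune here
        node = node[ch]
    stack = [(node, prefix)]
    while stack:
        node, acc = stack.pop()
        for ch, child in node.items():
            if ch == '$':
                yield acc
            else:
                stack.append((child, acc + ch))

def word_squares(words, n):
    """Yield n-by-n squares whose rows and columns are all words."""
    trie = build_trie(w for w in words if len(w) == n)

    def extend(rows):
        if len(rows) == n:
            yield list(rows)
            return
        k = len(rows)
        # The next row must start with the letters already laid
        # down in column k by the previous rows.
        prefix = ''.join(row[k] for row in rows)
        for candidate in words_with_prefix(trie, prefix):
            yield from extend(rows + [candidate])

    yield from extend([])
```

With a toy list such as `["ball", "area", "lead", "lady"]` and `n=4`, the search recovers the classic square whose rows and columns all read as words; the same prefix-pruning idea should scale to much larger databases, since dead-end columns are abandoned as soon as no word matches them.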

As mentioned in another post, it would also be interesting to get into the RiTa library and produce micro-projects on a regular basis.

The trajectory toward the final show could be redesigned as a ‘braid’ composed of two main threads: the Machine Learning acquisition process (fat cat, heavy duty, with only a hope of becoming truly productive by the end of the summer, but a very important part of my future work), and the constrained, language-based projects in Python and, perhaps, JavaScript (lean and mean, ideally many small iterations, leading to an archipelago of texts).


One main obstacle to small iterations, even in the non-ML part, is that some of my past projects are already quite ‘heavy’, as well as prone to exciting my systematic, obsessive nature: the WordSquares project, for instance, in order to be ‘complete’ and ‘fully operational’, requires not only building databases of hundreds of thousands of squares (already quite time-consuming to generate), but also a fair bit of mining afterwards. This will still be my objective for the next few days.