Science is the basis of what we do at Cervest: we are led by research and driven by data. Scientific research improves with collaboration, which is why our science team does not work alone but is strengthened by partnerships with research institutes and academics. These include PhD students whose research overlaps with our work. Today we’re introducing you to Harrison Zhu, whose PhD in Statistical Machine Learning is part of a collaboration between academia and industry, in this case Cervest.
What is the PhD programme ‘StatML CDT’?
The StatML Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning at Imperial College London and Oxford University is a PhD programme that runs for 5 years at both universities. It aims to educate around 75 students to become the next leaders in Statistics and Statistical Machine Learning, with a strong focus on training for methodological advancements in the modern era. It is funded and supported by the Engineering and Physical Sciences Research Council (EPSRC), along with many industry partners such as Qualcomm, JP Morgan, Novartis and Cervest, as well as academic partners including Harvard, ETH Zurich and UC Berkeley.
What have you been working on?
So far, on the methodological side, I have been working on kernel methods and probabilistic numerics. For real-world applications, my research is mostly aligned with that of Cervest (with some side projects whenever there’s time!). My first PhD project was on the general framework of aggregate output methods for crop yield modelling. I explored the connections between different aggregate output models and experimented with Cervest’s proprietary remote sensing imagery datasets. It was a great project, but it lasted only 3 months, so there were definitely loose ends and more work to do. I’d like to get back to it as soon as I finish my current project, which is in the areas of Bayesian optimisation, active learning and computer vision.
I’m also currently working with colleagues at Imperial on emergency responses for COVID-19 – which makes what I do feel very useful!
What’s a challenge you’re happy to have cracked or are looking forward to tackling?
I have recently been working on a paper with a few collaborators on Bayesian numerical integration with tree-based models. Here we developed a new methodology using regression trees, which are models that give you outputs after going through a series of logic gates. The whole setup is extremely elegant, and it was especially satisfying to see both theoretical guarantees and experimental results that are competitive with the current benchmarks based on Gaussian processes. I mostly worked on the experiments and coding, and seeing dozens of threads running at maximum capacity on a high-performance computing server was extremely satisfying.
I look forward to formulating and improving current methodologies for crop yield modelling. This is important for food security, which is a pressing issue nowadays as natural disasters are occurring more frequently and the world population is growing. My first mini-project for my PhD programme looked at yield modelling techniques using aggregate output models. Most of the time was spent building up a codebase that I could use in the future. Now that I have a solid theoretical foundation and codebase, I hope to obtain deeper insights using the knowledge and tools that I have already built.
What’s your background prior to the StatML CDT and how did you find the programme?
I entered the CDT with quite a diverse mathematical and computational background, especially after 4 difficult years at Imperial and EPFL (in Switzerland). I took lots of very theoretical courses such as measure-theoretic probability theory and functional analysis, but also applied ones like deep learning, biostatistics, advanced regression and foundations of algorithms. It was quite easy for me to settle into the programme. The people I work with at Imperial and Cervest are very helpful and supportive, and the abundance of projects, high-performance computing resources and, most of all, knowledge makes me feel constantly productive! I also get to meet a lot of academics outside of Imperial and travel around to attend workshops, give talks and present posters, which is a lot of fun!
Why is working on climate change important to you?
Extreme events such as floods, heatwaves, wildfires and disease outbreaks are increasingly becoming threats to our society, both economically and in terms of public safety. As inhabitants of Earth, we need to adapt to this uncertain environment.
There has been a lot of debate and public action regarding responsibilities towards the climate. For me, it’s more important to be directly involved by doing research on topics such as geospatial modelling. That way we can build the tools that will allow us to understand how to reduce the expected impact of climate volatility on us. Cervest is building a platform and research group to achieve exactly this objective. This initiative makes me feel more optimistic about our ability to address the climate emergency.
Where can people learn more about you? Twitter? Webpage?
I have my own academic and personal websites. I usually try to update my personal one whenever I have time – you can currently find a few tech blog posts on the main page. I like open source, so you can also find me contributing to project code and committing (spaghetti ;)) code to my own repositories. I’ve been playing around with Twitter for a while, as it seems like most academics in the machine learning community are extremely active there. I haven’t posted anything yet, but hopefully there will be more posts about both my own and Cervest’s research soon! I also do some advising for the Imperial Data Science Society (ICDSS), so you might be able to find me at their events and hackathons.
Finally, can you tell us something we might not guess about you?
Hmmm, this is an interesting question… I suppose it’s that I’m a massive coffee addict, and so I’m literally turning coffee into theorems (borrowing Rényi’s description of Erdős)!