
Machine Learning - IUU fishing
August 20, 2025
R is designed to allow a person to quickly generate scripts, with professional-standard charts and outputs in PDF and HTML formats. One can easily perform exploratory data analysis (EDA) with R and, being a statistical language, it supports statistics and machine learning very well. I used R for 4 years whilst studying, though for my MSc project I chose to use Julia. Whilst R has some very good strengths, it also has some shortcomings, most notably:
The project and its analysis were conducted using Julia, a general purpose and statistical programming language (though reporting was predominantly in R, with geospatial plots created using QGIS). Performance is a main goal of Julia, which combines a macro pre-compilation step (though unlike C and C++, it is rare to write one's own macros), dynamic dispatch, and compilation options including ahead-of-time and just-in-time. The result is a language which makes performance tuning simple. The @allocated and @time macros made it easy to assess memory allocation and speed, the very basic information required for tuning; the @. macro vectorised expressions, which is an elegant and time-saving approach; finally, @view and @with enabled pass-by-reference semantics to reduce memory allocations. Testing and test suites had an understandable and flexible interface.
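As a rough sketch of how the Base macros mentioned above are used, the snippet below measures two ways of computing a sum of squares. The data and the function names `sum_sq_alloc` and `sum_sq_lazy` are made up for illustration and are not taken from the project; @with is left out because it is not part of Base Julia.

```julia
# Hypothetical workload: sum of squares over a large random vector.
x = rand(1_000_000)

# Allocating version: `v .^ 2` builds a temporary array before summing.
sum_sq_alloc(v) = sum(v .^ 2)

# Lazy version: `abs2` is applied element by element inside `sum`,
# so no temporary array is needed.
sum_sq_lazy(v) = sum(abs2, v)

# Call each function once so that compilation is not part of the measurements.
sum_sq_alloc(x); sum_sq_lazy(x)

# @time prints the elapsed time and the allocations of a call.
@time sum_sq_alloc(x)
@time sum_sq_lazy(x)

# @allocated returns the number of bytes allocated by a call.
bytes_alloc = @allocated sum_sq_alloc(x)
bytes_lazy = @allocated sum_sq_lazy(x)
println("allocating: ", bytes_alloc, " bytes; lazy: ", bytes_lazy, " bytes")

# @. converts every operator and call in the expression to its broadcast
# (dotted) form, so this is the same as `2 .* x .^ 2 .+ 1`.
y = @. 2 * x^2 + 1

# @view returns a window onto `x` rather than a copy,
# so writing through the view changes `x` itself.
head = @view x[1:10]
head[1] = 0.0   # x[1] is now 0.0 as well
```

Comparing the two @allocated figures is usually enough to spot where a temporary array is being created.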
For R programmers looking to use Julia, there is a package with Tidyverse interfaces and features to ease language adoption.
Unlike R (which is a statistical language), Julia is a general purpose language. Interacting with a relational database, making HTTP calls and creating object definitions were all found to be straightforward. Object definitions in particular take a reasonable amount of work in R, which confusingly offers many different ways to define objects, each supporting different functionality.
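To give a feel for object definitions in Julia, here is a minimal sketch of a record type with validation and a function that dispatches on it. The type `VesselPosition`, its fields and the function `describe` are hypothetical and chosen only for illustration; they are not the project's actual definitions.

```julia
# Hypothetical record type; the field names are illustrative only.
struct VesselPosition
    vessel_id::Int
    latitude::Float64
    longitude::Float64

    # The inner constructor validates coordinates whenever an object is created.
    function VesselPosition(vessel_id, latitude, longitude)
        -90 <= latitude <= 90 ||
            throw(ArgumentError("latitude out of range: $latitude"))
        -180 <= longitude <= 180 ||
            throw(ArgumentError("longitude out of range: $longitude"))
        return new(vessel_id, latitude, longitude)
    end
end

# Behaviour is added with ordinary functions that dispatch on the argument type.
describe(p::VesselPosition) =
    "vessel $(p.vessel_id) at ($(p.latitude), $(p.longitude))"

p = VesselPosition(42, 51.5, -0.12)
println(describe(p))
```

A `struct` is immutable by default; declaring it as `mutable struct` allows fields to be changed after construction.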
Most Julia libraries are written in Julia, so if an error is not understood the source code can be read. This was absolutely required on only one occasion, as the documentation is typically of a high standard and the Julia discussion board covers many common misconceptions. A suitable library was always easy to find and, reassuringly, on a project which used twenty-seven software packages the only difficulty was in performing bulk loads into the database (which is heavily dependent on the database driver). This was resolved in a couple of evenings early on in the project.
R is, however, a more forgiving language and is easier to learn. Julia, like many general purpose languages, benefits from a computer science background: for instance, in Julia numeric underflow and overflow happen silently by default, an accepted tradeoff for greater performance. The real number ranges also differed between Julia and Postgres, with Julia having the wider real number type, so the Julia code needed to round numbers in some instances to ensure compatibility with the Postgres column type defined on the database table. The Julia libraries (like Go libraries) reflect the underlying interfaces and protocols very closely, so if you already have a computing background the APIs are intuitive.
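The overflow behaviour and the kind of rounding described above can be sketched as follows. This assumes the Postgres column is of type `real` (a 4-byte float, comparable to Julia's `Float32`) and that the Julia value is a `Float64`; the post does not say which types were actually involved, and `to_pg_real` is a hypothetical helper, not the project's code.

```julia
# Fixed-width integers wrap around silently on overflow.
wrapped = typemax(Int64) + 1
println(wrapped == typemin(Int64))   # true

# Checked arithmetic from Base raises an error instead of wrapping.
overflowed = try
    Base.checked_add(typemax(Int64), 1)
    false
catch err
    err isa OverflowError
end

# Float64 covers a wider range than Float32: values too large become Inf32
# on conversion, and values too small underflow to zero.
too_big = Float32(1.0e40)      # Inf32
too_small = Float32(1.0e-50)   # 0.0f0

# Hypothetical helper: round a Float64 to the nearest Float32 and reject
# finite values that fall outside the Float32 range.
function to_pg_real(v::Float64)
    r = Float32(v)
    if !isfinite(r) && isfinite(v)
        throw(DomainError(v, "outside the range of a 4-byte real"))
    end
    return r
end

println(to_pg_real(0.1 + 0.2))
```

Rejecting out-of-range values is only one possible policy; clamping to `floatmax(Float32)` or storing the column as `double precision` are alternatives.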
R has its strength in being easy to learn, easy to use and quick to get started with. The RStudio Integrated Development Environment (IDE) is preferable to VS Code, which was used for this project; though adequate for a university project, VS Code lacked the finesse of professional IDEs such as IntelliJ or Visual Studio. There are other IDEs available for Julia: the open source Julia plugin for IntelliJ was used with the IntelliJ IDE at first, but it kept crashing with valid code. VS Code was then adopted for the project and, given that it worked, the focus remained on the project goals as opposed to finding a better IDE.
In summary, if performance is not a concern, you work predominantly alone, you want an easy language to learn, or you are generating documentation (possibly from inputs generated by another language), then R is a very good fit. For large datasets which require high performance, if you already have a computing background, or where multiple people are working on a project, I would suggest Julia is a better fit.