How to use Julia in Quarto – Jihong Zhang, Ph.D.

1 Previous posts

This post illustrates how to use Julia to create a gradient descent algorithm. What has not been introduced, however, is how to perform the data analysis using Julia in Quarto. This post will illustrate the workflow step by step.

2 Initial Setup

First of all, refer to this Quarto.org, JuliaHub, and Patrick Altmeyer’s post. The first step is to install following components:

IJulia
Revise.jl
Jupyter Cache

Terminal

using Pkg
Pkg.add("IJulia")
Pkg.add("Revise")
using Conda
Conda.add("jupyter-cache")

Second, when you create the new quarto document, make sure the yaml header contains the jupyter item. For example, the yaml of this post is:

title: 'How to use Julia in Quarto'

date: 'Mar 10 2024'
categories:
  - Julia
  - Quarto
format: 
  html: 
    code-summary: 'Code'
    code-fold: false
    code-line-numbers: false
jupyter: julia-1.6

After the installation, you should be able to run the julia code in quarto like:

print("Hello World!")

Hello World!

3 Import dataset

# import packages
using DataFrames
using CSV
# load in the diamonds.csv
diamonds = DataFrame(CSV.File("diamonds.csv"))

first(diamonds, 7)

7×10 DataFrame

Row	carat	cut	color	clarity	depth	table	price	x	y	z
	Float64	String15	String1	String7	Float64	Float64	Int64	Float64	Float64	Float64
1	0.23	Ideal	E	SI2	61.5	55.0	326	3.95	3.98	2.43
2	0.21	Premium	E	SI1	59.8	61.0	326	3.89	3.84	2.31
3	0.23	Good	E	VS1	56.9	65.0	327	4.05	4.07	2.31
4	0.29	Premium	I	VS2	62.4	58.0	334	4.2	4.23	2.63
5	0.31	Good	J	SI2	63.3	58.0	335	4.34	4.35	2.75
6	0.24	Very Good	J	VVS2	62.8	57.0	336	3.94	3.96	2.48
7	0.24	Very Good	I	VVS1	62.3	57.0	336	3.95	3.98	2.47

4 Basic Statistical Modeling

Following the previous post, we can easily model a generalized linear regression using GLM module:

using GLM
lm_fit = lm(@formula(price ~ depth), diamonds)

StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

price ~ 1 + depth

Coefficients:
────────────────────────────────────────────────────────────────────────
               Coef.  Std. Error      t  Pr(>|t|)  Lower 95%   Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)  5763.67    740.556    7.78    <1e-14  4312.17    7215.16
depth         -29.65     11.9897  -2.47    0.0134   -53.1499    -6.15005
────────────────────────────────────────────────────────────────────────

Let’s do some more advanced measurement - Factor analysis:

using MultivariateStats
# only sample first 300 cases and four variables
Xtr = diamonds[1:300 , [:x, :y, :z]]
# with each observation in a column
Xtr = Matrix(Xtr)' # somehow the data matrix has size of (d, n), which is the trasponse of data matrix in R 
# train a one-factor model
M = fit(FactorAnalysis, Xtr; maxoutdim=1, method=:em)

Factor Analysis(indim = 3, outdim = 1)

You can refer to this doc for more details for parameter estimation of factor analysis

loadings(M)

3×1 Matrix{Float64}:
 0.8294175777737991
 0.8157441937710099
 0.5052202721703213

Let’s quickly compare the results of lavaan

library(ggplot2)
library(lavaan)
data('diamonds')
X = diamonds[1:300, c('x', 'y', 'z')]
fa_model = "
F1 =~ x + y + z
"
fit = cfa(fa_model, data = X, std.lv = TRUE)
coef(fit)[1:3] # factor loading
    F1=~x     F1=~y     F1=~z 
0.7802245 0.7673664 0.4752576

5 Some Popular Julia Package

5.1 Flux

Flux is an elegant approach to machine learning. It’s a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia’s native GPU and AD support. Flux makes the easy things easy while remaining fully hackable. See more details in Juliahub.

using Flux, Plots
data = [([x], 2x-x^3) for x in -2:0.1f0:2]

model = Chain(Dense(1 => 23, tanh), Dense(23 => 1, bias=false), only)

optim = Flux.setup(Adam(), model)
for epoch in 1:1000
  Flux.train!((m,x,y) -> (m(x) - y)^2, model, data, optim)
end

plot(x -> 2x-x^3, -2, 2, legend=false)
scatter!(x -> model([x]), -2:0.1f0:2)

Code

# This will prompt if neccessary to install everything, including CUDA:
using Flux, CUDA, Statistics, ProgressMeter

# Generate some data for the XOR problem: vectors of length 2, as columns of a matrix:
noisy = rand(Float32, 2, 1000)                                    # 2×1000 Matrix{Float32}
truth = [xor(col[1]>0.5, col[2]>0.5) for col in eachcol(noisy)]   # 1000-element Vector{Bool}

# Define our model, a multi-layer perceptron with one hidden layer of size 3:
model = Chain(
    Dense(2 => 3, tanh),   # activation function inside layer
    BatchNorm(3),
    Dense(3 => 2),
    softmax) |> gpu        # move model to GPU, if available

# The model encapsulates parameters, randomly initialised. Its initial output is:
out1 = model(noisy |> gpu) |> cpu                                 # 2×1000 Matrix{Float32}

# To train the model, we use batches of 64 samples, and one-hot encoding:
target = Flux.onehotbatch(truth, [true, false])                   # 2×1000 OneHotMatrix
loader = Flux.DataLoader((noisy, target) |> gpu, batchsize=64, shuffle=true);
# 16-element DataLoader with first element: (2×64 Matrix{Float32}, 2×64 OneHotMatrix)

optim = Flux.setup(Flux.Adam(0.01), model)  # will store optimiser momentum, etc.

# Training loop, using the whole data set 1000 times:
losses = []
@showprogress for epoch in 1:1_000
    for (x, y) in loader
        loss, grads = Flux.withgradient(model) do m
            # Evaluate model and loss inside gradient context:
            y_hat = m(x)
            Flux.crossentropy(y_hat, y)
        end
        Flux.update!(optim, model, grads[1])
        push!(losses, loss)  # logging, outside gradient context
    end
end

optim # parameters, momenta and output have all changed
out2 = model(noisy |> gpu) |> cpu  # first row is prob. of true, second row p(false)

mean((out2[1,:] .> 0.5) .== truth)  # accuracy 94% so far!

0.954

using Plots  # to draw the above figure

p_true = scatter(noisy[1,:], noisy[2,:], zcolor=truth, title="True classification", legend=false)
p_raw =  scatter(noisy[1,:], noisy[2,:], zcolor=out1[1,:], title="Untrained network", label="", clims=(0,1))
p_done = scatter(noisy[1,:], noisy[2,:], zcolor=out2[1,:], title="Trained network", legend=false)

plot(p_true, p_raw, p_done, layout=(1,3), size=(1000,330))

plot(losses; xaxis=(:log10, "iteration"),
    yaxis="loss", label="per batch")
n = length(loader)
plot!(n:n:length(losses), mean.(Iterators.partition(losses, n)),
    label="epoch mean", dpi=200)