print("Hello World!")
Hello World!
A previous post illustrated how to implement a gradient descent algorithm in Julia. What it did not cover, however, is how to perform the data analysis with Julia inside Quarto. This post walks through that workflow step by step.
First of all, refer to the Quarto.org documentation, JuliaHub, and Patrick Altmeyer’s post. The first step is to install the following components: Julia itself, the IJulia kernel, and (optionally) Revise.jl and jupyter-cache for faster re-rendering.
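A minimal sketch of that setup from the Julia REPL (the exact package set follows the Quarto documentation; adjust to your needs):

```julia
using Pkg

# The Jupyter kernel Quarto uses to execute Julia chunks
Pkg.add("IJulia")

# Optional: Revise.jl speeds up repeated renders during development
Pkg.add("Revise")
```

Afterwards, running `quarto check jupyter` in a terminal verifies that Quarto can find a working Jupyter installation.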
Second, when you create a new Quarto document, make sure the YAML header contains the `jupyter` item. For example, the YAML header of this post is:
```yaml
---
title: 'How to use Julia in Quarto'
date: 'Mar 10 2024'
categories:
  - Julia
  - Quarto
format:
  html:
    code-summary: 'Code'
    code-fold: false
    code-line-numbers: false
jupyter: julia-1.6
---
```
After the installation, you should be able to run Julia code in Quarto. For example:
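The chunk that produced the preview below was lost in extraction; here is a minimal sketch that reproduces it, assuming the classic `diamonds` data is loaded through RDatasets (the package and dataset call are an assumption, not necessarily what the original post used):

```julia
using RDatasets, DataFrames

# ggplot2's diamonds dataset: 53,940 diamonds with price and quality measurements
diamonds = dataset("ggplot2", "diamonds")

# Preview the first seven rows
first(diamonds, 7)
```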
| Row | carat | cut | color | clarity | depth | table | price | x | y | z |
|---|---|---|---|---|---|---|---|---|---|---|
| | Float64 | String15 | String1 | String7 | Float64 | Float64 | Int64 | Float64 | Float64 | Float64 |
| 1 | 0.23 | Ideal | E | SI2 | 61.5 | 55.0 | 326 | 3.95 | 3.98 | 2.43 |
| 2 | 0.21 | Premium | E | SI1 | 59.8 | 61.0 | 326 | 3.89 | 3.84 | 2.31 |
| 3 | 0.23 | Good | E | VS1 | 56.9 | 65.0 | 327 | 4.05 | 4.07 | 2.31 |
| 4 | 0.29 | Premium | I | VS2 | 62.4 | 58.0 | 334 | 4.2 | 4.23 | 2.63 |
| 5 | 0.31 | Good | J | SI2 | 63.3 | 58.0 | 335 | 4.34 | 4.35 | 2.75 |
| 6 | 0.24 | Very Good | J | VVS2 | 62.8 | 57.0 | 336 | 3.94 | 3.96 | 2.48 |
| 7 | 0.24 | Very Good | I | VVS1 | 62.3 | 57.0 | 336 | 3.95 | 3.98 | 2.47 |
Following the previous post, we can easily fit a linear regression using the GLM package:
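The model-fitting chunk itself is missing from the extracted post; a minimal sketch that matches the output below (`price ~ 1 + depth` fitted by ordinary least squares), assuming the `diamonds` data frame from above:

```julia
using GLM

# Ordinary least squares: regress price on depth (the intercept is added automatically)
ols = lm(@formula(price ~ depth), diamonds)
```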
```
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

price ~ 1 + depth

Coefficients:
────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)  5763.67     740.556    7.78    <1e-14   4312.17    7215.16
depth         -29.65      11.9897  -2.47    0.0134    -53.1499    -6.15005
────────────────────────────────────────────────────────────────────────
```
Let’s do a more advanced analysis: factor analysis.
```julia
using MultivariateStats

# sample only the first 300 cases and three variables
Xtr = diamonds[1:300, [:x, :y, :z]]

# put each observation in a column: the data matrix has size (d, n),
# which is the transpose of the data-matrix convention in R
Xtr = Matrix(Xtr)'

# train a one-factor model
M = fit(FactorAnalysis, Xtr; maxoutdim=1, method=:em)
```
```
Factor Analysis(indim = 3, outdim = 1)
```
You can refer to this doc for more details on parameter estimation for factor analysis. Let’s quickly compare the results with those from R’s lavaan package.
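The comparison itself does not survive in the extracted post, but the pieces you would line up against lavaan’s solution can be pulled from the fitted model; a sketch, assuming the model `M` from above and my reading of the MultivariateStats accessors:

```julia
# Factor loadings: the analogue of lavaan's lambda matrix
loadings(M)

# Estimated variable means and unique variances (the diagonal of the psi matrix)
mean(M)
var(M)
```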
Finally, let’s try some machine learning. Flux is an elegant approach to machine learning: a 100% pure-Julia stack that provides lightweight abstractions on top of Julia’s native GPU and AD support. Flux makes the easy things easy while remaining fully hackable. See more details on JuliaHub.
```julia
using Flux, Plots

# Fit the curve y = 2x - x^3 with a small neural network
data = [([x], 2x - x^3) for x in -2:0.1f0:2]
model = Chain(Dense(1 => 23, tanh), Dense(23 => 1, bias=false), only)

optim = Flux.setup(Adam(), model)
for epoch in 1:1000
    Flux.train!((m, x, y) -> (m(x) - y)^2, model, data, optim)
end

# Compare the true curve with the network's predictions
plot(x -> 2x - x^3, -2, 2, legend=false)
scatter!(x -> model([x]), -2:0.1f0:2)
```
Here is Flux’s quick-start example: training a small multi-layer perceptron on a noisy XOR classification task.

```julia
# This will prompt if necessary to install everything, including CUDA:
using Flux, CUDA, Statistics, ProgressMeter

# Generate some data for the XOR problem: vectors of length 2, as columns of a matrix:
noisy = rand(Float32, 2, 1000)                                      # 2×1000 Matrix{Float32}
truth = [xor(col[1] > 0.5, col[2] > 0.5) for col in eachcol(noisy)] # 1000-element Vector{Bool}

# Define our model, a multi-layer perceptron with one hidden layer of size 3:
model = Chain(
    Dense(2 => 3, tanh),   # activation function inside layer
    BatchNorm(3),
    Dense(3 => 2),
    softmax) |> gpu        # move model to GPU, if available

# The model encapsulates parameters, randomly initialised. Its initial output is:
out1 = model(noisy |> gpu) |> cpu  # 2×1000 Matrix{Float32}

# To train the model, we use batches of 64 samples, and one-hot encoding:
target = Flux.onehotbatch(truth, [true, false])  # 2×1000 OneHotMatrix
loader = Flux.DataLoader((noisy, target) |> gpu, batchsize=64, shuffle=true);
# 16-element DataLoader with first element: (2×64 Matrix{Float32}, 2×64 OneHotMatrix)

optim = Flux.setup(Flux.Adam(0.01), model)  # will store optimiser momentum, etc.

# Training loop, using the whole data set 1000 times:
losses = []
@showprogress for epoch in 1:1_000
    for (x, y) in loader
        loss, grads = Flux.withgradient(model) do m
            # Evaluate model and loss inside gradient context:
            y_hat = m(x)
            Flux.crossentropy(y_hat, y)
        end
        Flux.update!(optim, model, grads[1])
        push!(losses, loss)  # logging, outside gradient context
    end
end

optim  # parameters, momenta and output have all changed
out2 = model(noisy |> gpu) |> cpu  # first row is prob. of true, second row p(false)

mean((out2[1,:] .> 0.5) .== truth)  # accuracy about 95% so far!
```

```
0.954
```
```julia
using Plots  # to draw the above figure

p_true = scatter(noisy[1,:], noisy[2,:], zcolor=truth, title="True classification", legend=false)
p_raw  = scatter(noisy[1,:], noisy[2,:], zcolor=out1[1,:], title="Untrained network", label="", clims=(0,1))
p_done = scatter(noisy[1,:], noisy[2,:], zcolor=out2[1,:], title="Trained network", legend=false)

plot(p_true, p_raw, p_done, layout=(1,3), size=(1000,330))
```