by jsendak | Nov 1, 2024 | Art
Art Basel has announced the exhibitors and first highlights for its 2025 edition in Hong Kong. As one of the most prestigious art fairs in the world, Art Basel has been instrumental in shaping the global art market and promoting cultural exchange. The decision to hold the event in Hong Kong is significant, considering the city’s rich history as a global hub for trade and cultural exchange.
Hong Kong has always been a meeting point of East and West, a place where different cultures collide and blend. Its strategic location and vibrant atmosphere have attracted merchants, artists, and explorers throughout history. From its days as a British colony to its current status as a Special Administrative Region of China, Hong Kong has continuously evolved, embracing both Eastern and Western influences.
The presence of Art Basel in Hong Kong further solidifies the city’s position as an international center for contemporary art. The fair showcases a diverse range of artworks, representing artists from around the world. It provides a platform for galleries, collectors, and art enthusiasts to come together, fostering dialogue and promoting creativity.
With its meticulous selection process and high standards, Art Basel curates an exceptional array of artworks that reflect the current trends and themes in contemporary art. By highlighting the works of both established artists and emerging talents, the fair not only pushes the boundaries of creativity but also supports the growth and development of the art industry.
Art Basel Hong Kong 2025 promises to be an exceptional edition, with a lineup of esteemed exhibitors and exciting highlights. It is an opportunity for art lovers to immerse themselves in the dynamic art scene of Asia and explore the cutting-edge works being produced in the region.
Amid global challenges and uncertainties, Art Basel Hong Kong offers a beacon of hope and unity, reminding us of the power of art to transcend borders and bridge cultures. It symbolizes the resilience and creativity of the human spirit, as artists continue to create and inspire despite the obstacles they face.
As we look forward to Art Basel Hong Kong 2025, let us celebrate the rich history of this vibrant city, the diverse talents of artists around the world, and the transformative power of art to shape our lives and society.
Read the original article
by jsendak | Nov 1, 2024 | DS Articles
This post is the latest in our three-part series on MLOps with Vetiver,
following on from Part 1 and Part 2.
In those blogs, we introduced the {vetiver} package and its use as a
tool for streamlined MLOps. Using the {palmerpenguins} dataset as an
example, we outlined the steps of training a model using {tidymodels}
then converting this into a {vetiver} model. We then demonstrated the
steps of versioning our trained model and deploying it into production.
Getting your first model into production is great! But it’s really only
the beginning, as you will now have to carefully monitor it over time to
ensure that it continues to perform as expected on the latest data.
Thankfully, {vetiver} comes with a suite of functions for this exact
purpose!
Preparing the data
A crucial step in the monitoring process is the introduction of a time
component. We will be tracking key scoring metrics over time as new data
is collected, therefore our analysis will now depend on a time dimension
even if our deployed model has no explicit time dependence.
To demonstrate the monitoring steps, we will be working with the World
Health Organisation Life Expectancy data, which tracks the average life
expectancy in various countries over a number of years. We start by
loading the data:
# mode = "wb" keeps the zip archive intact when downloading on Windows
download.file("https://www.kaggle.com/api/v1/datasets/download/kumarajarshi/life-expectancy-who",
              "archive.zip", mode = "wb")
unzip("archive.zip")
life_expectancy = readr::read_csv("./Life Expectancy Data.csv")
We will attempt to predict the life expectancy using the percentage
expenditure, total expenditure, population, body-mass-index (BMI) and
schooling. Let’s select the columns of interest, tidy up the variable
names and drop any missing values:
life_expectancy = life_expectancy |>
  janitor::clean_names(case = "snake",
                       abbreviations = c("BMI")) |>
  dplyr::select("year", "life_expectancy",
                "percentage_expenditure",
                "total_expenditure", "population",
                "bmi", "schooling") |>
  tidyr::drop_na()
life_expectancy
#> # A tibble: 2,111 × 7
#> year life_expectancy percentage_expenditure total_expenditure population
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2015 65 71.3 8.16 33736494
#> 2 2014 59.9 73.5 8.18 327582
#> 3 2013 59.9 73.2 8.13 31731688
#> 4 2012 59.5 78.2 8.52 3696958
#> 5 2011 59.2 7.10 7.87 2978599
#> 6 2010 58.8 79.7 9.2 2883167
#> 7 2009 58.6 56.8 9.42 284331
#> 8 2008 58.1 25.9 8.33 2729431
#> 9 2007 57.5 10.9 6.73 26616792
#> 10 2006 57.3 17.2 7.43 2589345
#> # ℹ 2,101 more rows
#> # ℹ 2 more variables: bmi <dbl>, schooling <dbl>
The data contains a numeric year column, which will come in handy for
monitoring the model performance over time. However, the {vetiver}
monitoring functions will require this column to use <date>
("YYYY-MM-DD") formatting and it will have to be sorted in ascending
order:
life_expectancy = life_expectancy |>
  dplyr::mutate(
    year = lubridate::ymd(year, truncated = 2L)
  ) |>
  dplyr::arrange(year)
life_expectancy
#> # A tibble: 2,111 × 7
#> year life_expectancy percentage_expenditure total_expenditure
#> <date> <dbl> <dbl> <dbl>
#> 1 2000-01-01 54.8 10.4 8.2
#> 2 2000-01-01 72.6 91.7 6.26
#> 3 2000-01-01 71.3 154. 3.49
#> 4 2000-01-01 45.3 15.9 2.79
#> 5 2000-01-01 74.1 1349. 9.21
#> 6 2000-01-01 72 32.8 6.25
#> 7 2000-01-01 79.5 347. 8.8
#> 8 2000-01-01 78.1 3557. 1.6
#> 9 2000-01-01 66.6 35.1 4.67
#> 10 2000-01-01 65.3 3.70 2.33
#> # ℹ 2,101 more rows
#> # ℹ 3 more variables: population <dbl>, bmi <dbl>, schooling <dbl>
Finally, let’s imagine the year is currently 2002, so our historical
training data should only cover the years 2000 to 2002:
historic_life_expectancy = life_expectancy |>
  dplyr::filter(year <= "2002-01-01")
Later in this post we will check how our model performs on more recent
data to illustrate the effects of model drift.
Training our model
Before we start training our model, we should split the data into
“train” and “test” sets:
library("tidymodels")
data_split = rsample::initial_split(
  historic_life_expectancy,
  prop = 0.7
)
train_data = rsample::training(data_split)
test_data = rsample::testing(data_split)
The test set makes up 30% of the original data and will be used to score
the model on unseen data following training.
The code cell below handles the steps of setting up a trained model in
{vetiver} and versioning it using {pins}. For a more detailed
explanation of what this code is doing, we refer the reader back to
Part 1.
We will again use a basic K-nearest-neighbour model, although this
time we have set up the workflow as a regression model since we are
predicting a continuous quantity. Note that this requires the {kknn}
package to be installed.
# Train the model with {tidymodels}
model = recipe(
  life_expectancy ~ percentage_expenditure +
    total_expenditure + population + bmi + schooling,
  data = train_data
) |>
  workflow(nearest_neighbor(mode = "regression")) |>
  fit(train_data)
# Convert to a {vetiver} model
v_model = vetiver::vetiver_model(
  model,
  model_name = "k-nn",
  description = "life-expectancy"
)
# Store the model using {pins}
model_board = pins::board_temp(versioned = TRUE)
vetiver::vetiver_pin_write(model_board, v_model)
Here the model {pins} board is created using pins::board_temp(), which
generates a temporary local folder.
At this point we should check how our model performs on the unseen test
data. The mean absolute error (mae), root-mean-squared error (rmse) and
R² (rsq) can be computed over a specified time period using
vetiver::vetiver_compute_metrics():
metrics = augment(v_model, new_data = test_data) |>
  vetiver::vetiver_compute_metrics(
    date_var = year,
    period = "year",
    truth = life_expectancy,
    estimate = .pred
  )
metrics
#> # A tibble: 9 × 5
#> .index .n .metric .estimator .estimate
#> <date> <int> <chr> <chr> <dbl>
#> 1 2000-01-01 46 rmse standard 4.06
#> 2 2000-01-01 46 rsq standard 0.836
#> 3 2000-01-01 46 mae standard 3.05
#> 4 2001-01-01 44 rmse standard 4.61
#> 5 2001-01-01 44 rsq standard 0.844
#> 6 2001-01-01 44 mae standard 3.43
#> 7 2002-01-01 36 rmse standard 4.14
#> 8 2002-01-01 36 rsq standard 0.853
#> 9 2002-01-01 36 mae standard 3.04
The first line of code here sends new data (in this case the unseen test
data) to our model and generates a .pred column containing the model
predictions. This output is then piped to
vetiver::vetiver_compute_metrics(), which includes the following
arguments:
- date_var: the name of the date column to use for monitoring the
model performance over time.
- period: the period ("hour", "day", "week", etc.) over which the
scoring metrics should be computed. We are restricted by our data to
using "year"; for more granular data it may be more sensible to
monitor the model over shorter timescales.
- truth: the actual values of the target variable (in our example this
is the life_expectancy column of the test data).
- estimate: the predictions of the target variable to compare the
actual values against (in our example this is the .pred column
computed in the previous step).
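To make the three metrics concrete, here is a minimal base R sketch that
computes them by hand for a single period. The truth and estimate vectors
are made-up toy values, not taken from the life expectancy data, and the
R² line assumes the squared-correlation definition used by the {yardstick}
rsq metric.

```r
# Toy values for one period (hypothetical, for illustration only)
truth = c(60, 65, 70, 75)
estimate = c(62, 64, 69, 78)

# Prediction errors for each observation
errors = truth - estimate

# Root-mean-squared error: penalises large errors more heavily
rmse = sqrt(mean(errors^2))

# Mean absolute error: the average size of the errors
mae = mean(abs(errors))

# R squared, taken here as the squared correlation between truth and estimate
rsq = cor(truth, estimate)^2

c(rmse = rmse, mae = mae, rsq = rsq)
```

Lower rmse and mae values and a higher rsq value indicate a better fit,
which is how the monitoring output should be read.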
We will come back to these metrics later in this post, so for now let’s
store them along with our model using {pins}:
pins::pin_write(model_board, metrics, "k-nn")
We will skip over the details of deploying our model since this is
already covered in Part 2.
Monitoring our model
Over time we may notice our model start to drift, where its
predictions gradually diverge from the truth as the data evolves. There
are two common causes of this:
- Data drift: the statistical distribution of an input variable
changes.
- Concept drift: the relationship between the target and an input
variable changes.
Taking the example of life expectancy data:
- A country’s expenditure is expected to vary over time due to changes
in government policy and unexpected events like pandemics and economic
crashes. This is data drift.
- Advances in medicine may mean that life expectancy can improve even if
BMI remains unchanged. This is concept drift.
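The two kinds of drift can be told apart with a small simulation. The
sketch below uses simulated data only: the variable names, the linear
relationship and the size of the shifts are all assumptions chosen for
illustration. Data drift is checked with a two-sample Kolmogorov-Smirnov
test on the input, and concept drift by applying a model fitted on the
baseline period to a later period and looking at the average error.

```r
set.seed(42)
n = 500

# Baseline period: a BMI-like input with a linear link to the target
bmi_old = rnorm(n, mean = 25, sd = 3)
life_old = 50 + 0.8 * bmi_old + rnorm(n)

# Data drift: the input distribution shifts, the relationship is unchanged
bmi_shifted = rnorm(n, mean = 28, sd = 3)

# Concept drift: same input distribution, but the target is now
# five years higher for any given input value
bmi_same = rnorm(n, mean = 25, sd = 3)
life_concept = 55 + 0.8 * bmi_same + rnorm(n)

# Data drift check: compare the input distributions of the two periods
data_drift_p = ks.test(bmi_old, bmi_shifted)$p.value

# Concept drift check: average error of the baseline model on the new period
baseline_fit = lm(life ~ bmi, data = data.frame(bmi = bmi_old, life = life_old))
concept_bias = mean(
  life_concept - predict(baseline_fit, newdata = data.frame(bmi = bmi_same))
)

data_drift_p   # very small: the input distribution has changed
concept_bias   # close to 5: the model now under-predicts systematically
```

In practice both effects tend to occur together, which is why we track the
scoring metrics directly rather than relying on input checks alone.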
Going back to our model which was trained using data from 2000 to 2002,
let’s now check how it would perform on “future” data up to 2010:
# Generate "new" data from 2003 to 2010
new_life_expectancy = life_expectancy |>
  dplyr::filter(year > "2002-01-01" &
                  year <= "2010-01-01")
# Score the model performance on the new data
new_metrics = augment(v_model, new_data = new_life_expectancy) |>
  vetiver::vetiver_compute_metrics(
    date_var = year,
    period = "year",
    truth = life_expectancy,
    estimate = .pred
  )
new_metrics
#> # A tibble: 24 × 5
#> .index .n .metric .estimator .estimate
#> <date> <int> <chr> <chr> <dbl>
#> 1 2003-01-01 141 rmse standard 5.21
#> 2 2003-01-01 141 rsq standard 0.760
#> 3 2003-01-01 141 mae standard 3.64
#> 4 2004-01-01 141 rmse standard 5.14
#> 5 2004-01-01 141 rsq standard 0.761
#> 6 2004-01-01 141 mae standard 3.60
#> 7 2005-01-01 141 rmse standard 5.83
#> 8 2005-01-01 141 rsq standard 0.684
#> 9 2005-01-01 141 mae standard 4.19
#> 10 2006-01-01 141 rmse standard 6.23
#> # ℹ 14 more rows
Let’s now store the new metrics in the model {pins} board (along with
the original metrics):
vetiver::vetiver_pin_metrics(
  model_board,
  new_metrics,
  "k-nn"
)
We can now load both the original and new metrics then visualise these
with vetiver::vetiver_plot_metrics():
# Load the metrics
monitoring_metrics = pins::pin_read(model_board, "k-nn")
# Plot the metrics
vetiver::vetiver_plot_metrics(monitoring_metrics) +
  scale_size(name = "Number of\nobservations", range = c(2, 4)) +
  theme_minimal()
The size of the data points represents the number of observations used
to compute the metrics at each period. Up to 2002 we are using the
unseen test data to score our model; after this we are using the full
available data set.
We observe an increasing model error over time, suggesting that the
deployed model should only be trained using the latest data. For this
particular data set it would be sensible to retrain and redeploy the
model annually.
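One way to turn this observation into a rule is to flag any period whose
error exceeds a baseline by some margin. The sketch below is only an
illustration: the rmse values are copied from the outputs printed above,
while the 20% tolerance and the choice of the 2000 to 2002 test scores as
the baseline are assumptions rather than part of {vetiver}.

```r
# rmse values copied from the metrics output shown earlier in this post
rmse_by_year = data.frame(
  .index = as.Date(paste0(2000:2006, "-01-01")),
  .estimate = c(4.06, 4.61, 4.14, 5.21, 5.14, 5.83, 6.23)
)

# Baseline: average rmse over the period covered by the training data
cutoff = as.Date("2002-01-01")
baseline = mean(rmse_by_year$.estimate[rmse_by_year$.index <= cutoff])

# Hypothetical rule: flag a period if rmse is more than 20% above baseline
tolerance = 1.2
rmse_by_year$retrain = rmse_by_year$.estimate > tolerance * baseline

# First period in which the rule would have triggered a retrain
first_breach = min(rmse_by_year$.index[rmse_by_year$retrain])
first_breach
```

With these numbers the rule triggers from 2003 onwards, the first year
after the training window, which is consistent with retraining annually.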
Summary
In this blog we have introduced the idea of monitoring models in
production using the Vetiver framework. Using the life expectancy data
from the World Health Organisation as an example, we have outlined how
to track key model metrics over time and identify model drift.
As you start to retire your old models and replace these with new models
trained on the latest data, make sure to keep ALL of your models (old
and new) versioned and stored. That way you can retrieve any historical
model and establish why it gave a particular prediction on a particular
date.
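As a minimal sketch of what retrieving a historical model can look like,
the example below writes two versions of a pin to a temporary versioned
board and then reads the older one back. The pin name "demo-model" and the
two lm() fits on the built-in cars data are placeholders for illustration;
they stand in for the k-nn model used in this post.

```r
library("pins")

# Temporary versioned board, as used earlier in this post
board = board_temp(versioned = TRUE)

# Two placeholder fits standing in for an "old" and a "new" model
old_model = lm(dist ~ speed, data = cars[1:25, ])
new_model = lm(dist ~ speed, data = cars)

pin_write(board, old_model, name = "demo-model", type = "rds")
Sys.sleep(1.1)  # make sure the two versions get distinct timestamps
pin_write(board, new_model, name = "demo-model", type = "rds")

# List every stored version, then read back the oldest one
versions = pin_versions(board, "demo-model")
oldest = versions$version[which.min(versions$created)]
restored = pin_read(board, "demo-model", version = oldest)

coef(restored)
```

Reading the pin without a version argument returns the latest version, so
an explicit version is only needed when investigating an older model.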
The {vetiver} framework also includes an R Markdown template for
creating a model monitoring dashboard. For more on this, check out the
{vetiver} documentation.
The next post in our Vetiver series will provide an outline of the
Python framework. Stay tuned for that sometime in the new year!
Continue reading: Vetiver: Monitoring Models in Production
Long-term implications and future developments: A deep dive into MLOps with Vetiver
This article covers the third part of the MLOps with Vetiver series, which focuses on monitoring models after they have been deployed. Its walkthrough of preparing the data, training a model and monitoring it brings out several key points.
The importance of model monitoring
Getting your first model into production is only the starting point. Once it’s up and running, you need to make sure it keeps performing as expected on the latest data. The Vetiver framework provides a suite of functions specifically designed for this purpose, allowing you to track your model’s performance over time as the input data, and its relationship to the target, change.
Concepts of data drift and concept drift
The piece also introduces concepts like data drift and concept drift. Data drift refers to the statistical distribution of an input variable changing over time. Meanwhile, concept drift means the relationship between the target and an input variable shifting. Ensuring that a model continues to perform well despite these changes is a primary goal of monitoring.
Actionable advice on how to approach monitoring
The detailed explanation leaves us with some useful actionable insights:
- Adopt a continuous approach to model monitoring, tracking key scoring metrics as data evolves.
- Introduce a time component during the monitoring process, as analysis becomes dependent on this.
- Ensure that your models are retrained with the latest data periodically, or as often as necessary depending on the application.
- Maintain rigorous versioning of your models; this makes it easier to track changes and to understand the likely cause of issues.
- If the model comes to a point where it errs consistently, it’s wiser to retire old models and replace them with newer ones trained on the latest data.
Future developments with Vetiver
In terms of what’s next for Vetiver, an outline of the Python framework is on the horizon. This could potentially open up new opportunities and make Vetiver useful to a wider audience, not just those using R.
Vetiver also provides an R Markdown template for creating model monitoring dashboards. This offers a more hands-on, visual tool for monitoring models and should support better decision-making by data scientists.
The ongoing development of Vetiver signifies a focus on closing the gap between model development and deployment, providing robust tools for the maintenance and optimization of models in production.
by jsendak | Nov 1, 2024 | DS Articles
This article presents a comprehensive discussion of when to choose which approach for your LLM and potential hybrid solutions.
Understanding the Best Approach for Your LLM and Possible Hybrid Solutions
Choosing the right approach for your Master of Laws (LLM) program has broader implications and can significantly shape your future legal profession. With the potential to utilize hybrid solutions, the way students learn and enact the law is transforming dramatically. This article dives deep into understanding these critical aspects and provides actionable insights to successfully navigate your legal education journey.
Long-term implications
Deciding on the right study model for your LLM program is impactful on a personal, academic and professional level. The traditional vs. hybrid learning approaches each carry significant long-term consequences in shaping a law professional’s career trajectory.
A traditional LLM program, through its rigorous classroom and campus engagement, can facilitate extensive professional networks and a deep immersion into subjects. By contrast, a hybrid learning system may offer a balance between personal commitments and academic pursuits and provide opportunities to learn and apply the law in real-world settings. Your choice will be influenced by factors such as academic goals, learning style, career ambitions, and personal circumstances.
Potential Future Developments
With the advent of digital technology, online education and distance learning are continually evolving and reshaping traditional educational systems. Hybrid LLM programs that fuse traditional and digital learning experiences are poised to become more prevalent in the future.
This shift towards more flexible learning models is likely to impact how future law professionals communicate, practise, and understand law in a rapidly evolving society. With virtual classrooms, online study materials, AI-powered tutoring, and digital law libraries, hybrid learning can provide an enriching learning experience, bridging the gap between traditional and modern-day legal practices.
Actionable Advice
- Evaluate Your Priorities: Before choosing between traditional and hybrid LLM programs, reflect on your personal, academic, and professional goals. Consider factors like time commitment, flexibility, financial capacity, and learning preferences.
- Research Thoroughly: Learn about various LLM programs by exploring university websites, speaking to faculty members or alumni and attending educational fairs or workshops.
- Consider Future Trends: Keep an eye on the latest developments in the legal education sector. Awareness of emerging trends such as virtual classrooms or online tutoring can help inform your decision.
- Seek Expert Advice: If possible, consult with counselors, academic advisors or legal professionals to gain further insights and make an informed decision.
In conclusion, choosing the approach for your LLM program requires careful consideration of a variety of factors, with a keen view towards future trends in legal education. Identifying the right balance can lead to a rewarding career path with opportunities for growth and advancement.
by jsendak | Nov 1, 2024 | DS Articles
I had the chance to listen to a talk that Ivan Lee, Founder and CEO of Datasaur, gave during October 2024’s AI Summit here at the Computer History Museum in Mountain View, CA. Datasaur is an AWS partner and marketplace seller. One of the things Lee and company have… Read more: What to consider when selecting a learning model
The Future of Learning Models: Insights from Datasaur’s CEO
In October 2024, Ivan Lee, the Founder and CEO of Datasaur, an AWS partner and marketplace seller, gave a talk at the AI Summit. Drawing on it, we look at the long-term implications and possible future developments for learning models within the AI landscape.
Key Points
Unfortunately, the full content of the talk wasn’t provided in the summarized text. Even so, Lee’s position and his company’s role in the AI and data analytics landscape potentially offer valuable insights. Datasaur’s perspective on learning models, as that of an AWS partner and marketplace seller, suggests that scalability, flexibility and accuracy will matter in AI-driven applications in the near future.
Future Developments
The technology world has always been fast-paced, especially when it comes to AI and machine learning. The influence of Lee’s company and the space it exists in reveal several key trends that are likely to shape AI’s future.
- Increased Demand for Scalability: As more businesses find themselves reliant on AI systems, there will be an increase in the demand for models that can scale effectively, allowing for simultaneous handling of multiple operations.
- Need for Flexibility: Data often comes in different forms and formats. Lessons from Datasaur underline the importance of flexible learning models that can accommodate and process various types of data.
- Heightened Importance of Accuracy: As AI plays a more central role in business operations, its accuracy becomes more critical to the business’ bottom line. The demand for more accurate learning models is likely to surge.
Long Term Implications
These key trends come with long-term implications. The demand for scalable, flexible and accurate learning models shifts the focus towards advanced research and development in the field of AI and machine learning. Organizations and companies, like Datasaur, will need to invest significantly in R&D, ultimately driving further technological advancements.
Actionable Advice for Businesses
- Adopt Flexibility: Businesses must seek out learning models that can process a variety of data types, thus catering to the diverse needs of any organization.
- Scale Wisely: As businesses grow, it’s essential they utilize scalable learning models that can grow with them.
- Strive for Accuracy: Implementing learning models which prioritize accuracy can be crucial for delivering precise insights and making sound decisions.
- Invest in Research: With the changing AI landscape, businesses need to consider investing in research and development to stay ahead of the market and maintain competitiveness.
“The AI landscape is evolving rapidly. Businesses must not only adapt to these changes, but proactively strive for progress”
by jsendak | Nov 1, 2024 | AI
Machine unlearning (MU) has gained significant attention as a means to remove specific data from trained models without requiring a full retraining process. While progress has been made in…
Machine unlearning (MU) has emerged as an innovative technique to selectively erase certain data from trained models, eliminating the need for extensive retraining. This article explores the advancements and challenges in the field of MU, highlighting its potential to enhance privacy, mitigate bias, and improve model performance. Despite notable progress, researchers are still grappling with the complexities of MU, including the identification of relevant data, the development of efficient unlearning algorithms, and the potential impact on model interpretability. By delving into these core themes, this article sheds light on the promising future of machine unlearning and its implications for the evolving landscape of artificial intelligence.
Machine unlearning (MU) has gained significant attention as a means to remove specific data from trained models without requiring a full retraining process. While progress has been made in developing MU techniques, there are underlying themes and concepts that deserve exploration in a new light, accompanied by innovative solutions and ideas.
The Ethical Dimension
One of the underlying themes in machine unlearning is the ethical dimension. As AI becomes more integrated into our lives, it is crucial to consider the impact of biased or erroneous data on trained models. MU presents an opportunity to rectify these issues by selectively removing problematic data points. However, the ethical responsibility falls on developers to ensure a fair and unbiased process of unlearning.
To address this, innovative solutions can be implemented that require developers to thoroughly analyze the removed data and question its potential biases. An algorithm could be designed to identify patterns of discrimination or misinformation within the data, flagging them for human review. This human oversight would ensure that the unlearning process aligns with ethical guidelines and promotes fairness.
Privacy and Data Protection
Another crucial theme in machine unlearning is privacy and data protection. As we entrust AI systems with more personal information, the ability to selectively unlearn sensitive data becomes imperative. MU provides a solution by allowing the removal of individual data points, enabling a balance between retaining model accuracy and safeguarding privacy.
Innovative ideas for data protection in MU could involve a combination of encryption techniques and differential privacy. Encrypted machine unlearning would allow for secure removal of specific data points without compromising privacy. Additionally, integrating differential privacy mechanisms during unlearning would add an extra layer of protection by ensuring that individual data points cannot be re-identified.
Dynamic and Continual Learning
Machine unlearning also raises the concept of dynamic and continual learning. Traditional machine learning models are trained on static datasets, limiting their ability to adapt and evolve as new data emerges. MU opens up possibilities for incorporating continual learning methodologies, allowing models to unlearn outdated or irrelevant data on the fly.
An innovative solution in this realm could be the development of an adaptive unlearning framework. This framework would analyze the relevance and accuracy of data over time, enabling continuous model refinement through targeted unlearning. By unlearning outdated data and focusing on recent and more relevant information, models can better adapt to changing circumstances and improve their performance in real-world applications.
Conclusion: Machine unlearning is an emerging field that presents exciting opportunities for improving the fairness, privacy, and adaptability of AI systems. By exploring the ethical dimension, prioritizing privacy and data protection, and incorporating dynamic learning methodologies, we can unlock the true potential of machine unlearning. As developers and researchers delve further into this field, it is paramount to consider these underlying themes and concepts, constantly innovating and iterating on our approaches to create responsible, robust, and continually improving AI systems.
While progress has been made in the field of machine learning, there are still challenges to overcome in the area of machine unlearning. The ability to selectively remove specific data from trained models is crucial for addressing privacy concerns, ensuring regulatory compliance, and handling biases that may have been unintentionally learned by the model.
One of the key advancements in machine unlearning is the development of algorithms that can identify and remove specific instances or patterns from the trained model without the need for retraining. This is particularly important in situations where certain data points or attributes need to be forgotten due to legal or ethical reasons. For example, in the case of personal data that should no longer be stored or used, machine unlearning can help ensure compliance with privacy regulations such as the General Data Protection Regulation (GDPR).
Another area where machine unlearning can be beneficial is in addressing biases that may exist within trained models. Biases can arise from the data used for training, reflecting societal prejudices or unequal representation. With machine unlearning, problematic biases can be identified and selectively removed, allowing for fairer and more unbiased decision-making processes.
However, there are several challenges that need to be addressed for machine unlearning to be widely adopted. One challenge is the lack of standardized techniques and frameworks for machine unlearning. As of now, there is no widely accepted approach or set of guidelines for implementing machine unlearning in different scenarios. This makes it difficult for researchers and practitioners to compare and replicate results, hindering the progress in this field.
Another challenge is the potential loss of performance or accuracy when removing specific data from trained models. Removing certain instances or patterns may lead to a decrease in the model’s overall performance, as the removed data might have contributed to the model’s ability to generalize and make accurate predictions. Balancing the removal of unwanted data with the preservation of model performance is a complex task that requires further research and development.
Looking ahead, the future of machine unlearning holds promise. As the field matures, we can expect to see the emergence of standardized techniques and frameworks, enabling more consistent and reliable machine unlearning processes. Additionally, advancements in explainable AI and interpretability will play a crucial role in understanding the impact of data removal on model behavior and performance.
Furthermore, the integration of machine unlearning within larger machine learning pipelines and frameworks will be essential. This will require seamless integration with existing model training and deployment processes, ensuring that machine unlearning becomes an integral part of the machine learning lifecycle.
In conclusion, machine unlearning has gained attention for its potential to selectively remove specific data from trained models. While progress has been made, challenges remain, such as the lack of standardized techniques and the potential loss of performance. However, with further research and development, machine unlearning has the potential to enhance privacy, address biases, and improve the overall fairness and transparency of machine learning systems.