Deploying a Linux Server for Academic Research
Recently, I repurposed a MiniPC into a server running the Ubuntu operating system, configuring it to meet potential future research needs.
Given my limited knowledge of Linux, I spent some time learning and attempting to complete the deployment, documenting the configuration steps, potential issues, and their corresponding solutions.
Creating a New User with Administrator Privileges on Ubuntu
On a cloud-hosted, vendor-provided, or self-configured Ubuntu server, a default user is created during setup. This user can usually elevate privileges temporarily with the sudo command, but it is otherwise a regular account rather than root. To create an additional user with the same administrator privileges, follow these steps:
1. Create a New User and Specify Home Directory, Login Shell
sudo useradd -d "/home/<user_name>" -m -s "/bin/bash" <user_name>
Parameter description:
- -d "/home/<user_name>": Sets the user's home directory to /home/<user_name>.
- -m: Creates the home directory if it does not exist.
- -s "/bin/bash": Sets the user's default login shell to /bin/bash.
2. Grant Administrator Privileges to the New User
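To create the user with sudo (administrator) privileges in a single step, add -G sudo to the command from step 1. Run this instead of that command, not after it, because useradd fails if the user already exists:
sudo useradd -d "/home/<user_name>" -m -s "/bin/bash" -G sudo <user_name>
If the user was already created in step 1, add it to the sudo group instead:
sudo usermod -aG sudo <user_name>
Where:
- -G sudo: Adds <user_name> to the sudo group, granting it administrator privileges.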
3. Set the New User’s Password
The newly created user has no usable password and cannot log in with one until it is set. Set a password for <user_name> with the following command:
sudo passwd <user_name>
After running this command, the system prompts you to enter the password twice. For security reasons, no characters are displayed while you type. Enter the password and press Enter to confirm.
After these steps, the new user <user_name> has administrator privileges and can execute commands using sudo.
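To confirm that the account was created as intended, the home directory, login shell, and group membership can be read back. The following is a minimal sketch rather than a required step; the helper names passwd_home_shell and has_group are hypothetical, and alice stands in for <user_name>.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking a freshly created account.

# Read passwd-format lines on stdin and print "<home> <shell>" for user $1.
passwd_home_shell() {
  awk -F: -v user="$1" '$1 == user { print $6, $7 }'
}

# Succeed if group $2 appears in the space-separated group list $1
# (the format printed by `id -nG <user>`).
has_group() {
  local group
  for group in $1; do
    if [ "$group" = "$2" ]; then
      return 0
    fi
  done
  return 1
}

# Usage on the server (alice is a placeholder):
#   getent passwd alice | passwd_home_shell alice
#   has_group "$(id -nG alice)" sudo && echo "alice can use sudo"
```

Parsing the output of getent and id keeps the check read-only, so it can be run by any user without sudo.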
Terminal Beautification
A readable, well-configured terminal prompt makes daily work more pleasant. Here, zsh is used to customize the terminal; refer to my earlier notes for the related setup.
Enabling Remote Access to the Server
To access the Ubuntu server remotely, install and configure the SSH service and adjust the firewall accordingly.
1. Install and Configure SSH Service
If only basic SSH access is required, installing openssh-server and enabling the SSH service is sufficient. For stricter security requirements, you can additionally harden the SSH configuration file /etc/ssh/sshd_config. Install the server first:
sudo apt update
sudo apt install openssh-server
After completing the installation, check the status of the SSH service to ensure it is running normally:
sudo systemctl status ssh
Configuration suggestions:
- Disable direct root login (recommended): Avoid logging in to the server directly as root. Open /etc/ssh/sshd_config, find PermitRootLogin, and set it to no.
sudo nano /etc/ssh/sshd_config
PermitRootLogin no
- Limit allowed users: The AllowUsers option restricts SSH logins to the listed users.
AllowUsers <user_name>
- Use a non-default port (optional): Change the SSH port from the default 22 to another port, such as 2200. This reduces noise from automated scans, but you must update the firewall rules at the same time.
Port 2200
- Drop dead connections automatically: To keep abandoned SSH sessions from holding resources for a long time, add the following lines. With these values, the server probes a silent client every 300 seconds and disconnects it after 2 unanswered probes. A client that is merely idle but still reachable answers the probes and stays connected.
ClientAliveInterval 300
ClientAliveCountMax 2
After making the changes, restart the SSH service to apply the configuration:
sudo systemctl restart ssh
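The suggestions above can also be collected in one snippet instead of being edited into the main file by hand. The sketch below assumes a recent Ubuntu release whose default sshd_config includes /etc/ssh/sshd_config.d/*.conf; the function name render_sshd_hardening and the file name 99-hardening.conf are hypothetical, and alice and 2200 are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: render the hardening options discussed above as one snippet.
# render_sshd_hardening is a hypothetical helper, not part of OpenSSH.
render_sshd_hardening() {
  local user="$1" port="${2:-22}"
  cat <<EOF
Port ${port}
PermitRootLogin no
AllowUsers ${user}
ClientAliveInterval 300
ClientAliveCountMax 2
EOF
}

# Usage on the server (assumes the sshd_config.d drop-in directory is included):
#   render_sshd_hardening alice 2200 | sudo tee /etc/ssh/sshd_config.d/99-hardening.conf
#   sudo sshd -t && sudo systemctl restart ssh   # restart only if the syntax check passes
```

Running sshd -t before the restart is a cheap safeguard: a syntax error in the configuration would otherwise prevent the service from starting and could lock you out of a remote machine.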
2. Configure UFW Firewall
If the UFW firewall is enabled, open the SSH port; the rule depends on whether you changed the port number. For the default port 22, use:
sudo ufw allow ssh
If you have changed the port number, such as setting it to 2200, you need the following command:
sudo ufw allow 2200/tcp
Configuration suggestions:
- Enable the firewall only after the SSH rule has been added (otherwise you may lock yourself out of a remote session), then check the status to confirm that the rules are applied:
sudo ufw enable
sudo ufw status
3. Verify SSH Connection
Test the connection to the server from a client system (such as Windows). Use a terminal tool that supports the SSH protocol, such as PuTTY or Windows Terminal.
To check from a Windows system that the server port is reachable, you can use the telnet command (telnet only confirms that the port accepts connections; use an SSH client for the actual login):
telnet <remote_ip> <remote_port>
Replace <remote_ip> with the IP address of the server and <remote_port> with the SSH port the server is listening on (22 by default).
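On a Linux client, the same reachability check can be done without telnet. This is a minimal sketch that assumes bash (for its /dev/tcp redirection) and the timeout utility from GNU coreutils; port_open is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: check whether a TCP port accepts connections, using bash's
# built-in /dev/tcp redirection. port_open is a hypothetical helper name.
port_open() {
  local host="$1" port="${2:-22}"
  # timeout bounds the wait when a firewall silently drops packets.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage (replace the placeholders as described above):
#   port_open <remote_ip> <remote_port> && echo "port reachable" || echo "port closed or filtered"
```

Like telnet, this only shows that something is listening on the port; it does not verify credentials or the host key.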
4. Fix .Xauthority File Permission Issues
Incorrect permissions on the /home/<user_name>/.Xauthority file usually mean that the file, or the home directory itself, was created by root and is therefore not owned by the user. Reset the ownership of the user's home directory, which also covers the SSH-related files inside it:
sudo chown <user_name>:<user_name> -R /home/<user_name>
If the problem persists, you can try creating a new .Xauthority file:
sudo -u <user_name> touch /home/<user_name>/.Xauthority
sudo chown <user_name>:<user_name> /home/<user_name>/.Xauthority
5. Set Up Fail2Ban (Recommended)
To further protect the SSH service from brute-force attacks, you can install and configure fail2ban. This tool detects repeated failed login attempts and temporarily bans the offending IP address:
sudo apt install fail2ban
On Ubuntu, the fail2ban package enables SSH protection by default. You can also create /etc/fail2ban/jail.local to adjust parameters such as the ban time and the number of retries:
[sshd]
enabled = true
port = 22
maxretry = 5
bantime = 600
With these settings, an IP address is banned for 600 seconds after 5 failed login attempts. If you changed the SSH port, set port to the same value.
Configuring SSH Connection between Server and GitHub
The following are detailed steps to configure an SSH connection between an Ubuntu server and GitHub, ensuring you can securely clone, push, and pull repositories on GitHub.
1. Install and Verify Git
First, install Git and confirm the installed version:
sudo apt install git
git --version
2. Configure Git User Information
Configure Git with your GitHub username and email. Use the same values as in your GitHub account so that commits are attributed to you correctly:
git config --global user.name "<github_account_name>"
git config --global user.email "<github_account_email>"
These settings are written to the ~/.gitconfig file and apply globally, that is, to all Git repositories used by this user.
3. Generate SSH Key Pair
To establish a secure connection with GitHub on the server, you need to generate an SSH key pair:
ssh-keygen -C "<github_account_email>" -t rsa
Description:
- -C "<github_account_email>": Adds a comment to the key, usually the email address of the GitHub account.
- -t rsa: Specifies the key type as RSA (a commonly used type supported by GitHub).
After running the command, press Enter three times to accept the default file name id_rsa and an empty passphrase. The key pair is stored in the ~/.ssh directory.
4. Add SSH Public Key to GitHub
- Print the generated public key and copy its contents:
cat ~/.ssh/id_rsa.pub
This command prints the public key to the terminal. Alternatively, open the file in a text editor (for example, vim ~/.ssh/id_rsa.pub) and copy it from there.
- Log in to the GitHub website and navigate to Settings → SSH and GPG keys → New SSH key.
- Paste the contents of id_rsa.pub into the New SSH key page, give the key a descriptive title (such as Ubuntu Server Key), and save it.
5. Test SSH Connection with GitHub
After the configuration is complete, test the connection with GitHub using the following command:
ssh -T git@github.com
When executing this command, GitHub will return a message confirming the connection is successful, for example:
Hi <github_account_name>! You've successfully authenticated, but GitHub does not provide shell access.
This message indicates that the SSH connection has been established, and you can now push to and pull from GitHub on the server.
6. Common Issues and Solutions
- SSH key permission issues: SSH refuses to use a private key whose permissions are too open. Check and set the permissions of the key pair:
chmod 600 ~/.ssh/id_rsa
chmod 644 ~/.ssh/id_rsa.pub
- Add the key to the SSH agent (recommended): If the key is not picked up automatically, start an agent in the current session and add the key to it. The agent does not survive a reboot, so repeat these commands (or add them to your shell startup file) in new sessions:
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
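The permission fix above can be wrapped in a small function that also covers the ~/.ssh directory itself. This is a sketch under the assumption that key files follow the default id_* naming; fix_ssh_perms is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: apply the conventional permissions to an SSH directory.
# fix_ssh_perms is a hypothetical helper name.
fix_ssh_perms() {
  local dir="${1:-$HOME/.ssh}" key
  chmod 700 "$dir"                      # directory: owner only
  for key in "$dir"/id_*; do
    [ -e "$key" ] || continue           # no key files present: nothing to do
    case "$key" in
      *.pub) chmod 644 "$key" ;;        # public keys may be world-readable
      *)     chmod 600 "$key" ;;        # private keys: owner read/write only
    esac
  done
}

# Usage:
#   fix_ssh_perms          # defaults to ~/.ssh
#   ls -l ~/.ssh           # inspect the result
```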
Python Environment Configuration and Management
Miniforge
To manage scientific Python environments on the server, I chose Miniforge, a lightweight alternative to Anaconda, as the package manager. Miniforge defaults to the conda-forge channel and bundles Mamba, which resolves and installs packages faster. The steps below install and configure Miniforge and then create and delete environments.
1. Install Miniforge
First, follow the installation instructions on the Miniforge GitHub project page to download and install it. The core installation commands are:
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
Configuration suggestions:
- It is recommended to install Miniforge in /usr/local/miniforge3, so that multiple users can share the environment but only the root user can modify it. The installer lets you choose the installation directory and creates the required folders automatically. Writing to /usr/local requires running the installer with sudo.
- If you are using zsh, confirm that the Miniforge path is added to the .zshrc file:
export PATH="/usr/local/miniforge3/bin:$PATH"
- Reload the .zshrc configuration:
source ~/.zshrc
2. Initialize Mamba Environment
After the installation is complete, initialize Mamba so that the mamba command can be used in the terminal. Assuming Miniforge is installed in /usr/local/miniforge3, execute the following command:
/usr/local/miniforge3/bin/mamba init
This command adds initialization code to your shell startup file; restart the terminal to apply the changes.
If it does not work with zsh, you can copy the mamba configuration block from .bashrc to .zshrc, or configure it in another way.
3. Create and Manage Environments
In a multi-user server environment, it is recommended to create and manage shared environments with root privileges (switch with sudo su). This prevents ordinary users from changing the environment configuration accidentally.
Create a New Environment
# Create a new environment with the specified name
mamba create --name <new_env_name> python=3.11 --no-default-packages
- --name <new_env_name>: Names the new environment.
- python=3.11: Sets the Python version to 3.11.
- --no-default-packages: Ignores any packages listed under create_default_packages in .condarc, so only Python and its dependencies are installed.
Ordinary users can create private environments in their home directories (for example, /home/username/myenv) with the following command:
mamba create --prefix /home/username/myenv python=3.11 --no-default-packages
Delete Environment
To delete an environment (including all associated packages), you can use the following command:
# Delete by environment name
mamba remove --name <env_name> --all
# Delete by path
mamba remove --prefix /path/to/directory --all
4. Install Common Packages
It is usually recommended not to modify the base environment directly, but to create a dedicated environment for each need. For example, you may want to install JupyterHub, which provides Jupyter notebook services for multiple users.
# Create Machine Learning environment
mamba create --name ml_env python=3.12 jupyterhub jupyterlab notebook scipy numpy
# Activate environment
mamba activate ml_env
# Install additional packages
mamba install jupyter-lsp-python jupyterlab-lsp jupyterlab-git jupyterlab_execute_time
5. Prevent Unauthorized Updates
In a multi-user environment, restricting ordinary users from modifying system-level environments helps maintain the stability of the environment. If a user tries to update the environment, they will encounter an insufficient permissions error. An example is as follows:
mamba update --all
Error message:
EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
environment location: /usr/local/miniforge3
uid: 1000
gid: 1000
This restriction is intentional: it keeps the shared environment consistent and prevents accidental changes by unprivileged users.
If you need to update the base environment, run the command with sudo:
sudo /usr/local/miniforge3/bin/mamba update --all
Poetry
Poetry is an efficient and convenient Python project dependency management tool, suitable for quickly creating and managing virtual environments, installing dependency libraries, and publishing Python packages.
1. Install Poetry
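Before installing Poetry, ensure that a supported Python 3 version is installed. Older Poetry releases required Python 3.7 or above; recent releases require a newer minimum, so check the Poetry documentation for the version you install.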
- Use the official installation script:
The Poetry installation script installs Poetry into the $HOME/.local/bin directory. Run the following command to download and execute it:
curl -sSL https://install.python-poetry.org | python3 -
- Add Poetry to PATH:
If $HOME/.local/bin is not already on your PATH, running poetry produces a command not found error. Add the directory to PATH for the current session with:
export PATH="$HOME/.local/bin:$PATH"
- Verify the installation:
Check the installed version to confirm that the installation succeeded:
poetry --version
2. Create a New Project
Poetry provides simplified commands to quickly generate the basic structure of a new project.
- Create a new project:
Use the following command to create a new project directory (for example, my_project) with a default pyproject.toml file:
poetry new my_project
This command generates the following structure in the project directory:
my_project/
├── my_project/
│   └── __init__.py
├── pyproject.toml
└── tests/
    └── __init__.py
- Initialize an existing project (optional):
To manage an existing project with Poetry, initialize it with poetry init. This command interactively generates the pyproject.toml file and lets you declare initial dependencies:
cd existing_project
poetry init
3. Manage Project Dependencies
Poetry provides a convenient dependency management method, distinguishing between production dependencies and development dependencies.
- Add production dependencies:
Add a runtime dependency, such as the requests library:
poetry add requests
- Add development dependencies:
For libraries used only in development and testing, use the --dev option, which adds the dependency to the [tool.poetry.dev-dependencies] section. For example, add pytest as a development dependency:
poetry add pytest --dev
Note that --dev is deprecated since Poetry 1.2 in favor of dependency groups (poetry add pytest --group dev).
- Install all dependencies:
Once the project dependencies are listed in the pyproject.toml file, install them all with:
poetry install
poetry install creates a virtual environment automatically and installs the dependencies into it. If a poetry.lock file already exists, the installed versions match those recorded in the lock file, which keeps environments consistent.
4. Manage Virtual Environments
By default, Poetry creates virtual environments outside the project directory and uses them automatically for commands such as poetry install and poetry run.
- Activate the virtual environment:
Open a shell inside the virtual environment created by Poetry:
poetry shell
In Poetry 2.x this command is no longer built in (it is provided by a separate shell plugin); poetry env activate prints the activation command instead.
- Exit the virtual environment:
When you have finished working in the virtual environment, type exit to leave it.
- View the virtual environment path:
To see where the virtual environment is stored, use:
poetry env info --path
- Delete the virtual environment (optional):
To recreate or clean up the virtual environment, delete it:
poetry env remove python
5. Manage Dependency Lock Files
Poetry uses the poetry.lock file to pin the exact version of every dependency, which keeps environments consistent.
- Update dependency versions:
To update dependencies, use the following command, which re-resolves them and updates the lock file:
poetry update
- Install the locked versions:
In collaborative projects, team members can install exactly the dependency versions recorded in the poetry.lock file:
poetry install
6. Run Scripts and Commands
Poetry supports running scripts or commands directly in the virtual environment, simplifying command management.
- Run a project script:
Use poetry run to execute a command inside the virtual environment. For example, run a Python script:
poetry run python script.py
- Run unit tests directly:
Test commands such as pytest can be run in the virtual environment the same way:
poetry run pytest
7. Publish Python Packages
Poetry can publish projects to PyPI or other custom package repositories.
- Build the project:
Package the project into .whl and .tar.gz files in preparation for publishing:
poetry build
- Publish to PyPI:
To publish the package to PyPI, configure your PyPI credentials beforehand (Poetry stores them through poetry config rather than reading ~/.pypirc), or let the publish command prompt for them interactively:
poetry publish --build
Note: To try out the publishing process first, use the --repository option to publish to the PyPI test repository. The repository name (testpypi here) must already be registered in Poetry's configuration.
poetry publish --repository testpypi
Configuring R Environment on Ubuntu for Econometric Analysis
In econometrics, R is well suited to processing economic data, regression analysis, time series analysis, and similar tasks. This section covers installing R and RStudio and configuring commonly used R packages.
1. Install R
Ubuntu’s default software repository contains R, but it may not be the latest version. To get the latest version of R, you can use the CRAN repository.
- Install prerequisites:
Update the package list and install the necessary dependencies:
sudo apt update
sudo apt install software-properties-common dirmngr -y
- Add the GPG key for the R project:
Download and add CRAN's GPG public key so that apt can verify the integrity of the packages:
wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | sudo tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
- Verify the GPG key (optional):
Check that the key fingerprint is E298A3A825C0D65DFD57CBB651716619E084DAB9:
gpg --show-keys /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
- Add the CRAN repository to the source list:
Add the CRAN repository to the apt source list so that you get the latest version of R:
sudo add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
- Install R and development packages:
Install the latest version of R and its development libraries:
sudo apt install r-base r-base-dev -y
2. Install RStudio
RStudio is an IDE well suited to data analysis and visualization; on a server, RStudio Server makes it available through the browser. For detailed steps, refer to the official RStudio installation guide.
Installation steps:
- Download the latest version of RStudio Server.
- Install it with the dpkg command:
sudo dpkg -i rstudio-server-<version>.deb
- Check the installation status:
sudo systemctl status rstudio-server
After RStudio Server is successfully installed, you can access it in your browser at http://<your-server-ip>:8787.
3. Install Econometrics-Related R Packages
Econometric analysis usually requires specialized data processing, regression analysis, and time series analysis packages.
- Install system dependencies:
Install system-level development packages so that R packages install smoothly (especially packages that need to be compiled, such as tidyverse and data.table):
sudo apt-get install build-essential libssl-dev libcurl4-openssl-dev libxml2-dev
- Install econometrics and data processing packages:
Start the R console:
sudo R
Then execute the following commands in the R console to install commonly used econometrics packages:
chooseCRANmirror(graphics = FALSE)
install.packages(c("tidyverse", "data.table", "broom", "plm", "forecast", "lmtest", "sandwich", "stargazer"))
  - tidyverse: Includes data processing and visualization packages such as dplyr, ggplot2, and tidyr.
  - data.table: Used for fast data processing.
  - broom: Converts regression results into tidy tables that are easy to analyze.
  - plm: Used for panel data regression analysis.
  - forecast: Used for time series forecasting and analysis.
  - lmtest and sandwich: Provide econometric tests and robust standard errors.
  - stargazer: Formats regression results as publication-ready tables for papers or reports.
- Install advanced econometrics tools:
  - AER (Applied Econometrics with R): Contains commonly used functions and data sets for economic research.
  - urca: Provides unit root and cointegration tests.
  - vars: Used for vector autoregressive (VAR) analysis.
install.packages(c("AER", "urca", "vars"))
- Install financial time series analysis packages (optional):
  - quantmod and TTR: Used for financial market data analysis and technical indicator calculation.
  - zoo and xts: Handle irregular time series data.
install.packages(c("quantmod", "TTR", "zoo", "xts"))
4. Install Private Packages Using GitHub Token
If you need to install some experimental or custom packages from GitHub, it is recommended to use a GitHub token to avoid API rate limits.
Generate a GitHub token in either of the following ways:
- From the R console (this opens the token creation page on GitHub in a browser):
usethis::create_github_token()
- On the GitHub website, create a new Personal Access Token (path: Settings → Developer settings → Personal access tokens → Tokens (classic)).
Configure the GitHub token:
Store the generated token for R by running the following command and pasting the token when prompted:
gitcreds::gitcreds_set()
With the token configured, you can install private packages from GitHub and avoid API rate limits.
5. Example: Install and Use the plm Package for Panel Data Regression
Suppose you want to run a regression on panel data. Here is an example of how to install plm and estimate a fixed effects model:
- Install plm (if not installed):
install.packages("plm")
- Load and use plm:
library(plm)
# Load the sample panel data set shipped with plm
data("Produc", package = "plm")
pdata <- pdata.frame(Produc, index = c("state", "year"))
# Fixed effects (within) model regression
fe_model <- plm(log(gsp) ~ log(pcap) + log(hwy) + log(water) + log(util), data = pdata, model = "within")
summary(fe_model)
Other issues
This section covers miscellaneous tasks: synchronizing data, installing fonts, and resolving Nvidia driver and library issues.
1. Synchronize data
To synchronize local data to a remote server, it is recommended to use rsync, which is efficient and reliable. For more details, refer to this tutorial.
- Synchronization command:
rsync -r /path/to/sync/ <username>@<remote_host>:<destination_directory>
The above command "pushes" all the contents of the local directory /path/to/sync/ to <destination_directory> on the remote server.
- Monitor the transfer progress of large files:
If you are synchronizing large files, you can use the watch command to monitor the progress:
watch -n <time_interval> du -sh /path/to/large/file
This command refreshes the reported file size every <time_interval> seconds.
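For repeated transfers, rsync's archive mode and built-in progress display can replace the separate watch step. The following is a sketch, not the only reasonable set of options; sync_push is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: push a directory with rsync in archive mode.
# sync_push is a hypothetical helper name.
sync_push() {
  # Normalize the source to end in exactly one slash so that the
  # directory's contents (not the directory itself) are copied.
  local src="${1%/}/" dest="$2"
  # -a preserves permissions, timestamps, and symlinks (and implies -r);
  # -h prints human-readable sizes; --partial keeps partially transferred
  # files so an interrupted run can resume; --progress shows per-file progress.
  rsync -ah --partial --progress "$src" "$dest"
}

# Usage (placeholders as in the command above):
#   sync_push /path/to/sync <username>@<remote_host>:<destination_directory>
```

Because rsync skips files that are already up to date, rerunning the same command after an interruption only transfers what is missing.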
2. Install common fonts
Linux does not ship with some fonts that are common on Windows (such as Arial and Times New Roman). Installing them improves the rendering of documents and websites, and in particular of figures that rely on these fonts. Install the Microsoft TrueType core fonts package and clear the matplotlib font cache:
sudo apt install msttcorefonts
rm -rf ~/.cache/matplotlib
- msttcorefonts includes a variety of Microsoft fonts, such as Arial and Times New Roman.
- The second command deletes the matplotlib cache directory so that matplotlib rebuilds its font list and picks up the newly installed fonts.
3. Driver/library version mismatch
If running nvidia-smi produces the following error:
Failed to initialize NVML: Driver/library version mismatch
you can refer to the solution on Stack Overflow. The brief steps are as follows:
- Restart the server: In some cases, a restart is enough, because it loads the kernel module that matches the updated libraries:
sudo reboot
- Uninstall and reinstall the Nvidia driver:
If restarting does not help, remove the existing Nvidia driver and reinstall it (the patterns are quoted so that the shell passes them to apt unchanged):
sudo apt purge 'nvidia*' 'libnvidia*'
sudo ubuntu-drivers install
sudo reboot
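Before purging anything, it can help to confirm that the mismatch is between the kernel module currently loaded and the one installed on disk. The sketch below assumes the driver exposes /proc/driver/nvidia/version and that the module is named nvidia; first_version is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: compare the loaded NVIDIA kernel module version with the
# version installed on disk. first_version is a hypothetical helper.

# Extract the first dotted version number (for example 550.54.14) from stdin.
first_version() {
  grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Usage on the server:
#   loaded="$(first_version < /proc/driver/nvidia/version)"   # module currently in the kernel
#   ondisk="$(modinfo -F version nvidia)"                     # module installed on disk
#   [ "$loaded" = "$ondisk" ] || echo "mismatch: loaded $loaded, on disk $ondisk"
```

If the two versions differ, a reboot is usually the first thing to try, since the newer module is only loaded at the next boot.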
4. Upgrade Nvidia driver
To upgrade the Nvidia driver, you can follow these steps:
- Uninstall the old driver:
sudo apt purge '*nvidia*' -y
sudo apt remove '*nvidia*' -y
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt autoremove -y && sudo apt autoclean -y
sudo rm -rf /usr/local/cuda*
- Find and install the recommended driver:
Run the following command to list the recommended Nvidia driver versions:
ubuntu-drivers devices
Or install a specific version with the following command (550 in this example; adjust the version number to your system):
sudo apt install libnvidia-common-550-server libnvidia-gl-550-server nvidia-driver-550-server -y
- Restart and check:
Restart the server, then run nvidia-smi to check whether the new driver is working:
sudo reboot now
If nvidia-smi returns the following error:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
try reinstalling the kernel header files and restarting.
5. Kernel header file installation and GCC configuration
If you encounter kernel header file or GCC version problems, follow these steps:
- Reinstall the kernel header files:
sudo apt install --reinstall linux-headers-$(uname -r)
sudo reboot
- Update the GCC version:
If a GCC error occurs while the kernel header files are being installed, you can upgrade to gcc-12:
sudo apt-get install gcc-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
After the kernel header files are reinstalled and the server is restarted, nvidia-smi should work normally.