Deploying a Linux Server for Academic Research

Recently, I repurposed a MiniPC into a server running the Ubuntu operating system, configuring it to meet potential future research needs.

Given my limited knowledge of Linux, I spent some time learning and attempting to complete the deployment, documenting the configuration steps, potential issues, and their corresponding solutions.

Creating a New User with Administrator Privileges on Ubuntu

Whether you use a cloud service, a vendor-provided image, or a self-configured Ubuntu server, a default user is created. Typically, this user can temporarily elevate privileges with the sudo command to perform tasks that require administrator rights, but it remains a regular user rather than an administrator account. To create a custom user with the same administrator privileges, follow these steps:

1. Create the New User

The following command creates a regular user with a home directory and a login shell:

sudo useradd -d "/home/<user_name>" -m -s "/bin/bash" <user_name>

Parameter Description:

-d "/home/<user_name>": Sets the user’s home directory to /home/<user_name>.
-m: Automatically creates the home directory.
-s "/bin/bash": Specifies the user’s default login shell as /bin/bash.

2. Grant Administrator Privileges to the New User

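To create a user with sudo (administrator) privileges in a single step, run the following command instead of the one in step 1. If the user already exists, useradd will fail; in that case, add the existing user to the sudo group with sudo usermod -aG sudo <user_name>.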

sudo useradd -d "/home/<user_name>" -m -s "/bin/bash" -G sudo <user_name>

Where:

-G sudo: Adds <user_name> to the sudo group, granting it administrator privileges.

3. Set the New User’s Password

The newly created user has no password by default. Set a password for <user_name> using the following command:

sudo passwd <user_name>

After running this command, the system will prompt you to enter the password twice. For security reasons, nothing is displayed while you type (not even asterisks). Enter the password and press Enter to confirm.

By following the above steps, you can successfully create a new user <user_name> with administrator privileges, who can execute commands using sudo.
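
To confirm that the new account really ended up in the sudo group, a check along the following lines may help. It is a minimal sketch: the function name user_in_group is hypothetical, and it relies only on the standard id command.

```shell
#!/usr/bin/env bash
# Return success if the given user belongs to the given group.
# user_in_group is a hypothetical helper name, not a system command.
user_in_group() {
    local user=$1 group=$2
    # id -nG prints the user's group names separated by spaces;
    # put one name per line and look for an exact match.
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$group"
}

# Example usage (replace <user_name> with the real account name):
#   user_in_group "<user_name>" sudo && echo "has sudo" || echo "no sudo"
```

If the check fails for a user you expected to be an administrator, the -G sudo option was probably omitted when the account was created.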

Terminal Beautification

A good-looking, easy-to-use terminal prompt makes work more pleasant. Here, zsh is used to beautify the terminal. Refer to previous records for related operations.

Enabling Remote Access to the Server

To access the Ubuntu server remotely rather than from its physical console, install and configure the SSH service and adjust the firewall accordingly.

1. Install and Configure SSH Service

If only basic SSH access is required, installing openssh-server and enabling the SSH service is sufficient. In environments with higher security requirements, you can further harden the SSH configuration file /etc/ssh/sshd_config, as described under Configuration Suggestions after the installation commands:

sudo apt update
sudo apt install openssh-server

After completing the installation, check the status of the SSH service to ensure it is running normally:

sudo systemctl status ssh

Configuration Suggestions:
- Disable Root User Direct Login (Recommended): Avoid logging into the server directly as root to increase security. Find PermitRootLogin in /etc/ssh/sshd_config and set it to no.
```
sudo nano /etc/ssh/sshd_config
```
```
PermitRootLogin no
```
- Limit Allowed Users: You can specify users allowed to log in via SSH through the AllowUsers configuration item, further enhancing security.
```
AllowUsers <user_name>
```
- Use a Non-Default Port (Optional): Change the SSH port from the default 22 to another port, such as 2200. This method can reduce the possibility of scanning attacks, but you need to update the firewall rules at the same time.
```
Port 2200
```
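- Drop Unresponsive Connections: To keep dead SSH sessions from occupying resources for a long time, add the following lines to the configuration. With these values, sshd sends a keepalive probe after 300 seconds without traffic from the client and closes the connection after 2 unanswered probes. Sessions whose client still responds stay connected.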
```
ClientAliveInterval 300
ClientAliveCountMax 2
```

After making the changes, restart the SSH service to apply the configuration:

sudo systemctl restart ssh
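
Before restarting, it can be useful to confirm which value sshd will actually use, because the first uncommented occurrence of a keyword wins. The following is a minimal sketch with a hypothetical helper name (sshd_setting); it ignores Include files, Match blocks, and the Key=value form, so treat its answer as a quick check rather than the final word.

```shell
#!/usr/bin/env bash
# Print the value of an sshd_config keyword as sshd would read it from a
# single file: the first uncommented occurrence wins, and keywords are
# case-insensitive. sshd_setting is a hypothetical helper name.
sshd_setting() {
    local file=$1 key=$2
    awk -v key="$key" '
        tolower($1) == tolower(key) {
            $1 = ""                # drop the keyword itself
            sub(/^[ \t]+/, "")     # trim the separator left behind
            print
            exit                   # later occurrences are ignored
        }
    ' "$file"
}

# Example usage:
#   sshd_setting /etc/ssh/sshd_config PermitRootLogin
#   sshd_setting /etc/ssh/sshd_config Port
# sshd can also check the whole file for syntax errors:
#   sudo sshd -t
```

Running the syntax check before the restart reduces the risk of locking yourself out with a broken configuration.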

2. Configure UFW Firewall

If the UFW firewall is enabled, make sure to open the SSH port; the rule depends on whether you changed the port number. If you are using the default port 22, you can directly use:

sudo ufw allow ssh

If you have changed the port number, such as setting it to 2200, you need the following command:

sudo ufw allow 2200/tcp

Configuration Suggestions:
- After enabling the firewall, check the status to ensure the rules are applied correctly:
```
sudo ufw enable
sudo ufw status
```
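
The order of these commands matters when you are connected over SSH: enabling the firewall before the SSH port is allowed can cut off your own session. The sketch below only prints the commands in a safe order and changes nothing; ufw_ssh_rule is a hypothetical helper name, and port 2200 is the example port used earlier.

```shell
#!/usr/bin/env bash
# Print the ufw command that opens the given SSH port.
# ufw_ssh_rule is a hypothetical helper; it only prints, it changes nothing.
ufw_ssh_rule() {
    local port=${1:-22}
    if [ "$port" = "22" ]; then
        echo "sudo ufw allow ssh"
    else
        echo "sudo ufw allow ${port}/tcp"
    fi
}

# Print the commands in a safe order: allow SSH first, then enable.
ufw_ssh_rule 2200
echo "sudo ufw enable"
echo "sudo ufw status verbose"
```

Review the printed commands and run them by hand once they match your configuration.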

3. Verify SSH Connection

Test the connection to the server from a client system (such as Windows). For actual logins, use a terminal tool that supports the SSH protocol, such as PuTTY or Windows Terminal on Windows.

To check only whether the server's SSH port is reachable from a Windows system, you can use the telnet command (note: telnet merely confirms that the port accepts connections; use an SSH client for real sessions):

telnet <remote_ip> <remote_port>

Replace <remote_ip> with the IP address of the server, and <remote_port> with the SSH port the server is listening on (default is 22).
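
If the client has bash available (for example another Linux machine or WSL), the same reachability test can be done without telnet. This is a minimal sketch that assumes bash was built with /dev/tcp support, as it is on Ubuntu; port_open is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Return success if a TCP connection to host:port can be opened within
# 3 seconds. port_open is a hypothetical helper name. It uses bash's
# built-in /dev/tcp redirection, so no telnet client is needed.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example usage:
#   port_open <remote_ip> 22 && echo "port reachable" || echo "port closed or filtered"
```

A reachable port only shows that something is listening; logging in with ssh -p <remote_port> <user_name>@<remote_ip> is still the real test.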

4. Fix .Xauthority File Permission Issues

Permission errors for the /home/<user_name>/.Xauthority file usually mean that the file or the home directory is owned by another user, often root (for example, because it was created with root privileges). Start by restoring ownership of the user's home directory, which also covers the files created there during SSH sessions:

sudo chown <user_name>:<user_name> -R /home/<user_name>

If the problem persists, you can try creating a new .Xauthority file:

sudo -u <user_name> touch /home/<user_name>/.Xauthority
sudo chown <user_name>:<user_name> /home/<user_name>/.Xauthority

5. Set Up Fail2Ban (Recommended)

To further protect the SSH service from brute-force attacks, you can install and configure fail2ban. This tool automatically detects multiple failed login attempts and temporarily disables the corresponding IP:

sudo apt install fail2ban

fail2ban will automatically enable SSH protection. You can also customize the /etc/fail2ban/jail.local file to adjust parameters such as the ban time and number of retries. If you changed the SSH port earlier, set port to the new value (for example, 2200) instead of 22:

[sshd]
enabled = true
port = 22
maxretry = 5
bantime = 600

With these settings, an IP address is banned for 600 seconds after 5 failed login attempts, further protecting the server.
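
To keep the jail's port in step with the SSH port, the section can be generated rather than typed by hand. The sketch below writes the same values as the example above to a file of your choice; write_sshd_jail is a hypothetical helper name, and findtime (the window in which failures are counted) is an addition not shown in the original example.

```shell
#!/usr/bin/env bash
# Write an [sshd] jail section to the given file, using the given SSH port.
# write_sshd_jail is a hypothetical helper; the values mirror the example
# above, plus findtime, the window in which failures are counted.
write_sshd_jail() {
    local file=$1 port=${2:-22}
    cat > "$file" <<EOF
[sshd]
enabled = true
port = ${port}
maxretry = 5
findtime = 600
bantime = 600
EOF
}

# Example usage: generate the file somewhere harmless, review it, then
# copy it into place (this overwrites an existing jail.local):
#   write_sshd_jail /tmp/jail.local 2200
#   sudo cp /tmp/jail.local /etc/fail2ban/jail.local
#   sudo systemctl restart fail2ban
#   sudo fail2ban-client status sshd
```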

Configuring SSH Connection between Server and GitHub

The following are detailed steps to configure an SSH connection between an Ubuntu server and GitHub, ensuring you can securely clone, push, and pull repositories on GitHub.

1. Install and Verify Git

First, install Git and confirm the installed version:

sudo apt install git
git --version

2. Configure Git User Information

Configure Git with your GitHub username and email. Use the same information as your GitHub account so that the author identity is recorded correctly when you commit code:

git config --global user.name "<github_account_name>"
git config --global user.email "<github_account_email>"

The above configuration is written to the ~/.gitconfig file. It is a global setting, so it applies to all Git repositories of this user.

3. Generate SSH Key Pair

To establish a secure connection with GitHub on the server, you need to generate an SSH key pair:

ssh-keygen -C "<github_account_email>" -t rsa

Description:
- -C "<github_account_email>": Add a comment to the key, usually the email address of the GitHub account.
- -t rsa: Specifies the key type as RSA (a commonly used type supported by GitHub).

After running the command, press Enter three times to accept the default file name (id_rsa) and an empty passphrase. The key pair will be stored in the ~/.ssh directory.

4. Add SSH Public Key to GitHub

Use the following command to open the generated public key file and copy its contents:
```
cat ~/.ssh/id_rsa.pub
```
This command will print the public key to the terminal. You can optionally open and copy it using a text editor (for example, using vim ~/.ssh/id_rsa.pub).
Log in to the GitHub website and navigate to Settings → SSH and GPG keys → New SSH key.
Paste the content from id_rsa.pub into the New SSH key page, and set a descriptive name for this key (such as Ubuntu Server Key), and then save it.

5. Test SSH Connection with GitHub

After the configuration is complete, test the connection with GitHub using the following command:

ssh -T git@github.com

When executing this command, GitHub will return a message confirming the connection is successful, for example:

Hi <github_account_name>! You've successfully authenticated, but GitHub does not provide shell access.

This message indicates that the SSH connection has been established, and you can now push to and pull from GitHub on the server.

6. Common Issues and Solutions

SSH Key Permission Issues: Ensure that the permissions of the SSH key pair files are correct to prevent connection problems. Check and set the permissions of the key:
```
chmod 600 ~/.ssh/id_rsa
chmod 644 ~/.ssh/id_rsa.pub
```
Add Key to SSH Agent (Recommended): If the key is not picked up automatically, you can start an SSH agent and add the key to it. The agent holds the key only for the current login session, so repeat these commands after a reboot or add them to your shell startup file:
```
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
```
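
The permission check mentioned above can be scripted so that chmod only runs when something is actually wrong. This is a minimal sketch: has_mode is a hypothetical helper name, and stat -c is the GNU coreutils form available on Ubuntu.

```shell
#!/usr/bin/env bash
# Return success if the file's permission bits equal the expected octal
# mode. has_mode is a hypothetical helper; stat -c is the GNU coreutils
# form found on Ubuntu.
has_mode() {
    local file=$1 expected=$2
    [ "$(stat -c '%a' "$file" 2>/dev/null)" = "$expected" ]
}

# Example usage:
#   has_mode ~/.ssh/id_rsa 600     || chmod 600 ~/.ssh/id_rsa
#   has_mode ~/.ssh/id_rsa.pub 644 || chmod 644 ~/.ssh/id_rsa.pub
```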

Python Environment Configuration and Management

Miniforge

For managing scientific Python environments on the server, I chose the lightweight and efficient Miniforge as the package management tool, replacing Anaconda. Miniforge defaults to the conda-forge channel and bundles Mamba, which makes package management faster. Below are the steps to install and configure Miniforge and to create and delete environments.

1. Install Miniforge

First, follow the installation instructions in the Miniforge GitHub project page to download and install. The following are the core installation commands:

wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

Configuration Suggestions:
- It is recommended to install Miniforge in /usr/local/miniforge3, so that multiple users can share the environment, but only the root user can modify it. During the installation process, you can choose the installation directory, and the system will automatically create the required folders.
- If you are using ZSH, you should confirm that the Miniforge path is added to the .zshrc file.
```
export PATH="/usr/local/miniforge3/bin:$PATH"
```
- Reload the .zshrc configuration: source ~/.zshrc
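
To add the PATH line without creating duplicates each time the setup is repeated, an idempotent append can be used. This is a minimal sketch with a hypothetical helper name (ensure_line); the path assumes the install location suggested above.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
# ensure_line is a hypothetical helper name.
ensure_line() {
    local file=$1 line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (the path assumes the install location suggested above):
#   ensure_line ~/.zshrc 'export PATH="/usr/local/miniforge3/bin:$PATH"'
```

Running it a second time leaves the file unchanged, so it is safe to keep in a provisioning script.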

2. Initialize Mamba Environment

After the installation is complete, initialize Mamba so that the command can be used in the terminal. Assuming Miniforge is installed in /usr/local/miniforge3, execute the following command:

/usr/local/miniforge3/bin/mamba init

This command will initialize the environment variables, and you need to restart the terminal to apply the changes.

If it does not work with ZSH, you can copy the mamba configuration from .bashrc to .zshrc, or configure it in other ways.

3. Create and Manage Environments

In a multi-user server environment, it is recommended to create and manage environments with root privileges (switched via sudo su). This approach prevents ordinary users from accidentally changing the shared environment configuration.

Create a New Environment

# Create a new environment with the specified name
mamba create --name <new_env_name> python=3.11 --no-default-packages

--name <new_env_name>: Name the new environment.
python=3.11: Set the Python version to 3.11.
--no-default-packages: Only install Python, do not automatically install other packages.

Ordinary users can create private environments in their home directories (for example, /home/username/myenv) with the following command:

mamba create --prefix /home/username/myenv python=3.11 --no-default-packages

Delete Environment

To delete an environment (including all associated packages), you can use the following command:

# Delete by environment name
mamba remove --name <env_name> --all
# Delete by path
mamba remove --prefix /path/to/directory --all

4. Install Common Packages

It is usually recommended not to directly modify the base environment, but to create a dedicated environment according to actual needs. For example, you may need to install JupyterHub, which provides Jupyter notebook services for multiple users.

# Create a machine learning environment
mamba create --name ml_env python=3.12 jupyterhub jupyterlab notebook scipy numpy
# Activate the environment
mamba activate ml_env
# Install additional packages
mamba install jupyter-lsp-python jupyterlab-lsp jupyterlab-git jupyterlab_execute_time

5. Prevent Unauthorized Updates

In a multi-user environment, restricting ordinary users from modifying system-level environments helps maintain the stability of the environment. If a user tries to update the environment, they will encounter an insufficient permissions error. An example is as follows:

mamba update --all

Error message:

EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
  environment location: /usr/local/miniforge3
  uid: 1000
  gid: 1000

This restriction keeps the shared environment consistent and prevents ordinary users from changing it by accident.

If you need to temporarily give permissions to update the base environment, you can use the sudo command:

sudo /usr/local/miniforge3/bin/mamba update --all
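
A script that may run under different accounts can test for write access first instead of waiting for the error. The sketch below is a rough approximation that only checks the top-level directory; env_writable is a hypothetical helper name, and the path assumes the install location used in this section.

```shell
#!/usr/bin/env bash
# Report whether the current user may modify a conda/mamba environment
# directory. env_writable is a hypothetical helper name; it only checks
# the top-level directory, which is an approximation.
env_writable() {
    local env_dir=$1
    [ -d "$env_dir" ] && [ -w "$env_dir" ]
}

# Example usage:
#   if env_writable /usr/local/miniforge3; then
#       mamba update --all
#   else
#       echo "base environment is read-only for $(id -un); ask an administrator" >&2
#   fi
```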

Poetry

Poetry is an efficient and convenient Python project dependency management tool, suitable for quickly creating and managing virtual environments, installing dependency libraries, and publishing Python packages.

1. Install Poetry

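Before installing Poetry, make sure Python 3.7 or above is installed. Recent Poetry releases require a newer minimum version (3.9 for Poetry 2.x), so check the official documentation for the release you install.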

Use the official installation script:

The Poetry installation script can automatically install Poetry in the $HOME/.local/bin directory. Run the following command to download and execute the installation script:
```
curl -sSL https://install.python-poetry.org | python3 -
```
Add Poetry to PATH:

By default, after the installation is complete, you need to add Poetry to the environment variables. If a command not found error occurs, you can add it to the environment variables of the current session using the following command:
```
export PATH="$HOME/.local/bin:$PATH"
```
Verify the installation:

After the installation is complete, you can check the installation version through the following command to confirm whether the installation is successful:
```
poetry --version
```

2. Create a New Project

Poetry provides simplified commands to quickly generate the basic structure of a new project.

Create a new project:

Use the following command to create a new project directory (for example, my_project) and generate the default pyproject.toml file.
```
poetry new my_project
```
This command will generate the following structure in the project directory:
```
my_project/
├── my_project/
│   └── __init__.py
├── pyproject.toml
└── tests/
    └── __init__.py
```
Initialize an existing project (optional):

If a project already exists and you want to use Poetry for management, you can initialize the project through poetry init. This command will guide the generation of the pyproject.toml file and configure initial dependencies:
```
cd existing_project
poetry init
```

3. Manage Project Dependencies

Poetry provides a convenient dependency management method, distinguishing between production dependencies and development dependencies.

Add production dependencies:

Add dependencies to the production environment, such as the requests library:
```
poetry add requests
```
Add development dependencies:

If some libraries are only used for development and testing, add them to a dependency group. Since Poetry 1.2, the --group dev option replaces the older --dev flag and records the dependency in the [tool.poetry.group.dev.dependencies] section. For example, add pytest as a development dependency:
```
poetry add pytest --group dev
```
Install all dependencies:

After the project dependencies are written to the pyproject.toml file, you can use the following command to install all dependencies:
```
poetry install
```
poetry install will automatically create a virtual environment and install the required dependencies in the virtual environment. If the poetry.lock file already exists, it will ensure that the installed dependency version is consistent with the version in the lock file to ensure environment consistency.

4. Manage Virtual Environments

By default, Poetry creates virtual environments outside the project directory and uses the project's environment automatically for commands such as poetry install and poetry run.

Activate the virtual environment:

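You can open a shell inside the virtual environment created by Poetry with the following command. In Poetry 2.x, shell is no longer built in and is provided by the poetry-plugin-shell plugin; poetry env activate prints the activation command instead: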
```
poetry shell
```
Exit the virtual environment:

After finishing work in the virtual environment, enter exit to exit.
View the virtual environment path:

If you need to view the actual storage path of the virtual environment, you can use:
```
poetry env info --path
```
Delete the virtual environment (optional):

If you need to recreate the virtual environment or clean up the environment, you can delete the virtual environment:
```
poetry env remove python
```

5. Manage Dependency Lock Files

Poetry uses the poetry.lock file to lock the exact version of the dependency to ensure consistency across environments.

Update dependency version:

When you need to update dependency versions, use the following command to re-resolve the dependencies within the constraints in pyproject.toml and update the lock file:
```
poetry update
```
Install the specified lock version:

In collaborative projects, team members can install the exact dependency version of the project based on the poetry.lock file:
```
poetry install
```

6. Run Scripts and Commands

Poetry supports running scripts or commands directly in the virtual environment, simplifying command management.

Run project script:

Use poetry run to execute commands in the virtual environment. For example, execute a Python script:
```
poetry run python script.py
```
Run unit tests directly:

You can directly run test commands in the virtual environment, such as pytest:
```
poetry run pytest
```

7. Publish Python Packages

Poetry can publish projects to PyPI or other custom package repositories.

Build the project:

Poetry can build the project with a single command, packaging it into .whl and .tar.gz files ready for publishing:
```
poetry build
```
Publish to PyPI:

To publish the package to PyPI, configure your credentials first, for example with poetry config pypi-token.pypi <your_token>, or let the publish command prompt for them interactively. Note that Poetry does not read the ~/.pypirc file:
```
poetry publish --build
```
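Note: To verify the publishing process in a test environment, you can use the --repository parameter to publish to the PyPI test repository. Poetry only knows PyPI by default, so register the test repository first, for example with poetry config repositories.testpypi https://test.pypi.org/legacy/.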
```
poetry publish --repository testpypi
```

Configuring R Environment on Ubuntu for Econometric Analysis

In econometrics, R is well suited to processing economic data, regression analysis, time series analysis, and similar tasks. This section covers installing R and RStudio and configuring commonly used R packages.

1. Install R

Ubuntu’s default software repository contains R, but it may not be the latest version. To get the latest version of R, you can use the CRAN repository.

Add CRAN repository:

Update the package list and install the necessary dependencies:
```
sudo apt update
sudo apt install software-properties-common dirmngr -y
```

Add GPG key for R project:

Download and add CRAN’s GPG public key to ensure the integrity of the package:

wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | sudo tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc

Verify GPG key (optional):

Verify the fingerprint of the key (E298A3A825C0D65DFD57CBB651716619E084DAB9):
```
gpg --show-keys /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
```
Add CRAN repository to source list:

Add the CRAN repository to the apt source list to ensure you get the latest version of R:
```
sudo add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
```
Install R and development packages:

Install the latest version of R and development libraries:
```
sudo apt install r-base r-base-dev -y
```

2. Install RStudio

RStudio is a powerful IDE suitable for data analysis and visualization; on a server, RStudio Server makes it available through a web browser. Installation is straightforward. For specific steps, refer to the official RStudio installation guide.

Installation steps:

Download the latest version of RStudio Server.

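Use the dpkg command to install (dpkg does not resolve dependencies, so if it reports missing packages, run sudo apt -f install afterwards):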

sudo dpkg -i rstudio-server-<version>.deb

Check the installation status:
```
sudo systemctl status rstudio-server
```

After RStudio Server is successfully installed, you can access it in your browser via http://<your-server-ip>:8787.

3. Install Common R Packages

Econometric analysis usually requires specialized packages for data processing, regression analysis, and time series analysis.

Install system dependencies:

Install system-level development packages to ensure smooth installation of R packages (especially for some packages that need to be compiled, such as tidyverse and data.table):
```
sudo apt-get install build-essential libssl-dev libcurl4-openssl-dev libxml2-dev
```
Install econometrics and data processing packages:

Start the R console with sudo, so that packages are installed system-wide for all users, and install commonly used econometrics packages:
```
sudo R
```
Execute the following commands in the R console:
```
chooseCRANmirror(graphics = FALSE)
install.packages(c("tidyverse", "data.table", "broom", "plm", "forecast", "lmtest", "sandwich", "stargazer"))
```
- tidyverse: Includes data processing and visualization packages such as dplyr, ggplot2, and tidyr.
- data.table: Used for fast data processing.
- broom: Organize regression analysis results into easy-to-analyze tables.
- plm: Used for panel data regression analysis.
- forecast: Used for time series forecasting and analysis.
- lmtest and sandwich: Provide econometric test tools and robust standard errors.
- stargazer: Used to output regression results into easy-to-understand tables, suitable for papers or reports.
Install advanced econometrics tools:
- AER (Applied Econometrics with R): Contains commonly used functions and data sets for economic research.
- urca: Provides unit root and cointegration tests.
- vars: Used for vector autoregressive (VAR) analysis.
```
install.packages(c("AER", "urca", "vars"))
```
Install financial time series analysis packages (optional):
- quantmod and TTR: Used for financial market data analysis and technical indicator calculation.
- zoo and xts: Process irregular time series data.
```
install.packages(c("quantmod", "TTR", "zoo", "xts"))
```

4. Install Private Packages Using GitHub Token

If you need to install some experimental or custom packages from GitHub, it is recommended to use a GitHub token to avoid API rate limits.

Generate GitHub token:

Open the token creation page from the R console (this requires the usethis package):
```
usethis::create_github_token()
```
Alternatively, generate a new Personal Access Token manually on the GitHub website (path: Settings → Developer settings → Personal access tokens → Tokens (classic)).

Configure GitHub token:

Store the generated token in the Git credential store so that R can use it (the gitcreds package prompts for the token):
```
gitcreds::gitcreds_set()
```

With the token configured, you can install private packages from GitHub and avoid API rate limits.

5. Example: Install and Use the `plm` Package for Panel Data Regression

Assuming you want to use panel data for regression analysis, here is an example of how to install plm and execute a fixed effects model:

Install plm (if not installed):
```
install.packages("plm")
```

Load and use plm:

library(plm)

# Load the sample panel data set shipped with plm
data("Produc", package = "plm")
pdata <- pdata.frame(Produc, index = c("state", "year"))

# Fixed effects model regression
fe_model <- plm(log(gsp) ~ log(pcap) + log(hwy) + log(water) + log(util), data = pdata, model = "within")
summary(fe_model)

Other issues

This section covers synchronizing data, installing fonts, and resolving Nvidia driver and library issues.

1. Synchronize data

To synchronize local data to a remote server, it is recommended to use rsync, which is an efficient and reliable synchronization method. See this tutorial for more details.

Synchronization command:
```
rsync -r /path/to/sync/ <username>@<remote_host>:<destination_directory>
```
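The above command will “push” all the contents of the local directory /path/to/sync/ to the <destination_directory> of the remote server. The -r option copies recursively without preserving permissions or timestamps; -a (archive mode) is often preferable when those should be kept.

Monitor the transmission progress of large files: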

If you are synchronizing large files, you can use the watch command to monitor the synchronization progress:
```
watch -n <time_interval> du -sh /path/to/large/file
```
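This command will refresh the reported size every <time_interval> seconds. By default, rsync writes to a temporary file in the destination directory and renames it when the transfer finishes, so watching the destination directory is more reliable than watching the final file name.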
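
The trailing slash on the source path decides whether rsync copies the directory itself or only its contents, which is easy to get wrong. The wrapper below is a minimal sketch that always copies the contents and uses archive mode; sync_dir is a hypothetical name, and --partial keeps interrupted transfers so they can be resumed.

```shell
#!/usr/bin/env bash
# Copy the contents of src into dest, preserving permissions and
# timestamps (-a) and keeping partially transferred files (--partial).
# sync_dir is a hypothetical wrapper name.
sync_dir() {
    local src=$1 dest=$2
    # The trailing slash on the source means "the contents of this
    # directory", not the directory itself.
    rsync -a --partial "${src%/}/" "$dest"
}

# Example usage:
#   sync_dir /path/to/sync <username>@<remote_host>:<destination_directory>
# To preview without copying, run rsync with --dry-run (-n) first.
```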

2. Install common fonts

Linux does not ship with some fonts that are common on Windows (such as Arial and Times New Roman). Installing them improves the display of documents and websites, and matters especially for graphic output that relies on these fonts. Install the Microsoft TrueType core font package and clear the matplotlib font cache:

sudo apt install msttcorefonts
rm -rf ~/.cache/matplotlib

msttcorefonts includes a variety of Microsoft fonts, such as Arial and Times New Roman.
The second command deletes the matplotlib cache directory to ensure that the updated fonts are loaded correctly.

3. Driver/library version mismatch

When running nvidia-smi, if the following error occurs:

Failed to initialize NVML: Driver/library version mismatch

You can refer to the solution on Stack Overflow. The brief steps are as follows:

Restart the server: In some cases, restarting the server can solve the problem:
```
sudo reboot
```
Uninstall and reinstall Nvidia driver:

If restarting does not help, try the following commands to purge the existing Nvidia driver and reinstall it (the patterns are quoted so that the shell passes them to apt unchanged):
```
sudo apt purge 'nvidia*' 'libnvidia*'
sudo ubuntu-drivers install
sudo reboot
```
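
This error typically appears when the driver packages were updated while an older kernel module is still loaded. Comparing the two versions shows whether that is the case. The following is a minimal sketch: nvrm_version is a hypothetical helper name, and it assumes the usual layout of /proc/driver/nvidia/version, where the version number is the first dotted number on the NVRM line.

```shell
#!/usr/bin/env bash
# Extract the driver version from text in the format of
# /proc/driver/nvidia/version (read from standard input).
# nvrm_version is a hypothetical helper name.
nvrm_version() {
    # Only the "NVRM version" line is relevant; take the first dotted
    # number on it.
    grep -m1 'NVRM version' | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1
}

# Example usage on a machine with the driver loaded:
#   loaded=$(nvrm_version < /proc/driver/nvidia/version)
#   installed=$(modinfo -F version nvidia)
#   [ "$loaded" = "$installed" ] || echo "loaded $loaded, installed $installed: reboot needed"
```

If the two versions differ, a reboot is usually enough; if they match and the error persists, reinstalling the driver is the next step.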

4. Upgrade Nvidia driver

To upgrade the Nvidia driver, you can follow these steps:

Uninstall the old driver:

sudo apt purge '*nvidia*' -y
sudo apt remove '*nvidia*' -y
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt autoremove -y && sudo apt autoclean -y
sudo rm -rf /usr/local/cuda*

Find and install the recommended driver:

Run the following command to find the recommended Nvidia driver version:
```
ubuntu-drivers devices
```
To install a specific version, use the following command (550 is only an example; adjust the version number according to system requirements):
```
sudo apt install libnvidia-common-550-server libnvidia-gl-550-server nvidia-driver-550-server -y
```
Restart and check:

Restart the server and use nvidia-smi to check whether the new driver is running normally:
```
sudo reboot now
```
If nvidia-smi returns the following error:
```
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
```
Try reinstalling the kernel header files and restarting.

5. Kernel header file installation and GCC configuration

If you encounter kernel header file or GCC version problems, follow these steps:

Reinstall the kernel header file:

sudo apt install --reinstall linux-headers-$(uname -r)
sudo reboot

Update GCC version:

If you encounter a GCC error during the kernel header file installation process, you can upgrade to gcc-12:
```
sudo apt-get install gcc-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
```
After reinstalling the kernel header file and restarting the server, nvidia-smi should be able to work normally.