Reference Websites
Maven Fundamentals - Java Tutorial - Liao Xuefeng’s Official Website
Maven Fundamentals
Introduction to Maven
Before understanding Maven, let’s look at what a Java project needs. First, we need to determine which dependency packages to introduce. For example, if we need to use commons logging, we must put the commons logging jar package into the classpath. If we also need log4j, we need to put all log4j-related jar packages into the classpath. This is dependency management.
Secondly, we must determine the directory structure of the project. For example, the src directory stores Java source code, the resources directory stores configuration files, and the bin directory stores the compiled .class files.
Furthermore, we also need to configure the environment, such as the JDK version, the compilation and packaging process, and the version number of the current code.
Finally, in addition to using IDEs like Eclipse for compilation, we must also be able to compile via command-line tools so that the project can be compiled, tested, and deployed on an independent server.
These tasks are not difficult, but they are very tedious and time-consuming. If every project had its own set of configurations, it would certainly be a mess. What we need is a standardized Java project management and build tool.
Maven is a management and build tool specifically created for Java projects. Its main features include:
- Providing a standardized project structure;
- Providing a standardized build process (compilation, testing, packaging, publishing…);
- Providing a dependency management mechanism.
Maven Project Structure
A typical Java project managed using Maven has the following default directory structure:
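    a-maven-project
    ├── pom.xml
    ├── src
    │   ├── main
    │   │   ├── java
    │   │   └── resources
    │   └── test
    │       ├── java
    │       └── resources
    └── target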
The project's root directory, a-maven-project, is named after the project. It contains the project description file pom.xml. Java source code goes in src/main/java, resource files in src/main/resources, test source code in src/test/java, and test resources in src/test/resources. Finally, all files generated by compilation and packaging are placed in the target directory. These constitute the standard directory structure of a Maven project.
All of these directories follow agreed-upon conventions, and we should not change them arbitrarily. With the standard structure in place, Maven works without any additional configuration.
Let’s look at the most critical project description file, pom.xml. Its content looks like the following:
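A minimal sketch of such a file is shown here. The coordinates com.example:a-maven-project:1.0 and the Java release 21 are assumptions chosen for illustration, and only the elements discussed in this section are included.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <!-- Hypothetical coordinates that identify this project -->
    <groupId>com.example</groupId>
    <artifactId>a-maven-project</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging>

    <properties>
        <!-- Source file encoding and the Java release to compile for -->
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.release>21</maven.compiler.release>
    </properties>
</project>
```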
Here, groupId is similar to a Java package name and is typically the name of a company or organization, while artifactId is similar to a Java class name and is typically the project name. Together, groupId, artifactId, and version uniquely identify a Maven project.
When we reference a third-party library, it is identified by the same 3 variables. For example, depending on org.slf4j:slf4j-simple:2.0.16:
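Written out in pom.xml, that coordinate becomes a <dependency> entry inside the <dependencies> element of the project:

```xml
<dependencies>
    <!-- groupId:artifactId:version = org.slf4j:slf4j-simple:2.0.16 -->
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-simple</artifactId>
        <version>2.0.16</version>
    </dependency>
</dependencies>
```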
After declaring a dependency using <dependency>, Maven will automatically download this dependency package and put it into the classpath.
Additionally, note that <properties> defines some properties. Commonly used properties are:
- project.build.sourceEncoding: Indicates the character encoding of the project source code, typically set to UTF-8;
- maven.compiler.release: Indicates the JDK version to use, for example, 21;
- maven.compiler.source: Indicates the source code version read by the Java compiler;
- maven.compiler.target: Indicates the Class version compiled by the Java compiler.
Starting from Java 9, it is recommended to use the maven.compiler.release property to ensure that the input source code and the compiled output version are consistent during compilation. If the source code and output versions are different, maven.compiler.source and maven.compiler.target should be set respectively.
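As a sketch of the second case, the properties below read Java 11 source and emit Java 17 class files. The two version numbers are arbitrary assumptions; the only constraint is that the target must not be lower than the source.

```xml
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <!-- Language level accepted by the compiler (illustrative value) -->
    <maven.compiler.source>11</maven.compiler.source>
    <!-- Class file version produced by the compiler (illustrative value) -->
    <maven.compiler.target>17</maven.compiler.target>
</properties>
```

When both versions are the same, a single maven.compiler.release entry replaces the pair.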
By defining properties via <properties>, the JDK version can be fixed, preventing different developers of the same project from using different versions of the JDK.
Summary
Maven is a management and build tool for Java projects:
- Maven uses pom.xml to define project content and utilizes a preset directory structure;
- Declaring a dependency in Maven can automatically download and import it into the classpath;
- Maven uses groupId, artifactId, and version to uniquely locate a dependency.
Dependency Management
If our project depends on third-party jar packages, such as commons logging, the question arises: where do we download the published jar package for commons logging?
If we also want to depend on log4j, what jar packages are needed to use log4j?
Similar dependencies include: JUnit, JavaMail, MySQL driver, etc. A feasible method is to search for the project’s official website via a search engine, manually download the zip package, extract it, and put it into the classpath. However, this process is very tedious.
Maven solves the dependency management problem. Suppose our project depends on the jar package abc, and abc in turn depends on the jar package xyz (project → abc → xyz).
When we declare the dependency for abc, Maven automatically adds both abc and xyz to our project dependencies. We don’t need to manually investigate whether abc requires xyz.
This is the first role of Maven: resolving dependencies. We declare only that our project needs abc; Maven imports the abc jar, determines that abc needs xyz, and imports the xyz jar as well. Ultimately, our project depends on both the abc and xyz jar packages.
Let’s look at a complex dependency example:
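A sketch of such a declaration follows. The groupId and artifactId are those of the real Spring Boot starter; the version number is an assumption chosen for illustration.

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <!-- Illustrative version; use the one your project targets -->
    <version>3.3.0</version>
</dependency>
```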
When we declare a spring-boot-starter-web dependency, Maven automatically resolves it and determines that roughly twenty to thirty other dependencies are ultimately required, among them spring-webmvc, jackson-databind, and tomcat-embed-core.
If we try to manually manage these dependencies ourselves, it is extremely time-consuming, laborious, and the probability of errors is very high.
Dependency Relationships
Maven classifies dependencies by scope. The commonly used scopes are compile, test, runtime, and provided:
| scope | Description | Example |
|---|---|---|
| compile | This jar package is needed during compilation (default) | commons-logging |
| test | This jar package is needed only when compiling and running tests | junit |
| runtime | Not needed during compilation, but required at runtime | mysql |
| provided | Needed during compilation, but provided by JDK or a server at runtime | servlet-api |
Among them, the default compile is the most commonly used, and Maven will place dependencies of this type directly into the classpath.
test dependencies indicate they are only used during testing and are not needed during normal execution. The most common test dependency is JUnit:
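For instance, a JUnit 5 declaration might look like this. The version shown is an assumption; the <scope> element is what marks the dependency as test-only.

```xml
<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter</artifactId>
    <!-- Illustrative version -->
    <version>5.10.2</version>
    <!-- Available to test code only, never to the main classpath -->
    <scope>test</scope>
</dependency>
```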
runtime dependencies indicate they are not needed during compilation but are required at runtime. The most typical runtime dependencies are JDBC drivers, such as the MySQL driver:
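A sketch using the MySQL Connector/J coordinates, with the version as an assumption: application code compiles against the JDBC interfaces in the JDK, so the driver jar is only needed when the program runs.

```xml
<dependency>
    <groupId>com.mysql</groupId>
    <artifactId>mysql-connector-j</artifactId>
    <!-- Illustrative version -->
    <version>8.0.33</version>
    <!-- Not on the compile classpath; added when running and testing -->
    <scope>runtime</scope>
</dependency>
```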
provided dependencies indicate they are needed during compilation but not at runtime. The most typical provided dependency is the Servlet API, which is needed during compilation; however, at runtime, the Servlet server has built-in related jars, so they are not needed during the execution phase:
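One possible declaration, assuming the Jakarta Servlet API and an illustrative version; older projects would use the javax.servlet coordinates instead.

```xml
<dependency>
    <groupId>jakarta.servlet</groupId>
    <artifactId>jakarta.servlet-api</artifactId>
    <!-- Illustrative version; match the server you deploy to -->
    <version>6.0.0</version>
    <!-- Compiled against, but supplied by the Servlet server at runtime -->
    <scope>provided</scope>
</dependency>
```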
The last question is: how does Maven know where to download the required jar packages? The answer is that Maven maintains a central repository (repo1.maven.org), to which third-party libraries upload their jars and related metadata. Maven downloads the required dependencies from the central repository to the local machine.
Maven does not download jar packages from the central repository every time. Once a jar package has been downloaded, it is automatically cached by Maven in a local directory (the .m2 directory in the user’s home directory). Therefore, apart from the first compilation being relatively slow due to the time needed for downloading, subsequent processes will not repeatedly download the same jar packages because of the local cache.
Unique ID
For any given dependency, Maven only needs 3 variables to uniquely identify a jar package:
- groupId: The name of the organization it belongs to, similar to a Java package name;
- artifactId: The name of the jar package itself, similar to a Java class name;
- version: The version of the jar package.
Through the above 3 variables, a jar package can be uniquely determined. Published release versions are immutable: the central repository requires jars to be PGP-signed and does not allow a released version to be replaced. The only way to modify a published jar package is to publish a new version.
Therefore, once a jar package has been downloaded by Maven, it can be permanently and safely cached locally.
Note: Maven treats only version numbers ending in -SNAPSHOT as development versions. Because a SNAPSHOT may change, Maven does not treat its cached copy as final and periodically re-checks the repository for a newer build. SNAPSHOT versions are intended for internal, private Maven repositories; the central repository accepts only release versions.
Henceforth, when we represent Maven dependencies, we use the abbreviated form groupId:artifactId:version, for example: org.slf4j:slf4j-api:2.0.4.
Maven Mirrors
Besides downloading from Maven’s central repository, you can also download from Maven mirror repositories. If accessing the central repository is very slow, we can choose a faster mirror. Mirror repositories synchronize periodically from the central repository, so they serve the same jar packages.
Users in the China region can use the Maven mirror repository provided by Alibaba Cloud. Using a Maven mirror repository requires configuration. In the user’s home directory, enter the .m2 directory and create a settings.xml configuration file with the following content:
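A minimal sketch of such a settings.xml is shown here. The id and name values are arbitrary labels, and the URL is assumed to be the address Alibaba Cloud publishes for its mirror of the central repository; check the provider's documentation before relying on it.

```xml
<settings>
    <mirrors>
        <mirror>
            <!-- Arbitrary identifier and display name for this mirror -->
            <id>aliyun</id>
            <name>aliyun</name>
            <!-- Redirect requests for the central repository to the mirror -->
            <mirrorOf>central</mirrorOf>
            <!-- Assumed mirror address; verify before use -->
            <url>https://maven.aliyun.com/repository/central</url>
        </mirror>
    </mirrors>
</settings>
```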
After the mirror repository is configured, downloads should be noticeably faster.
Searching for Third-Party Components
The final question: if we want to reference a third-party component such as okhttp, how exactly do we find its groupId, artifactId, and version? Search for the keyword at search.maven.org, open the matching component, and copy its dependency declaration.
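For okhttp, the copied declaration would look roughly like this. The version is an assumption; take the current one from the search result.

```xml
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <!-- Illustrative version -->
    <version>4.12.0</version>
</dependency>
```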
Command Line Compilation
In the command line, navigate to the directory where pom.xml is located, and enter the following command:
    mvn clean package
If everything goes smoothly, the packaged jar appears in the target directory once the build finishes.
Using Maven in an IDE
Almost all IDEs have built-in support for Maven. In Eclipse, you can directly create or import a Maven project. If the imported Maven project has errors, you can try selecting the project, right-clicking, and choosing Maven - Update Project… to update it.
Summary
- Maven determines the jar packages required by the project by resolving dependency relationships. The 4 commonly used scopes are: compile (default), test, runtime, and provided;
- Maven downloads the required jar packages from the central repository and caches them locally;
- Downloading can be accelerated through mirror repositories.
Build Process
Maven not only has a standardized project structure but also a standardized build process that automates compiling, packaging, publishing, and more.
Lifecycle and Phase
When using Maven, we first need to understand what Maven’s lifecycle is.
Maven’s lifecycle consists of a series of phases. Taking the built-in lifecycle default as an example, it includes the following phases:
- validate
- initialize
- generate-sources
- process-sources
- generate-resources
- process-resources
- compile
- process-classes
- generate-test-sources
- process-test-sources
- generate-test-resources
- process-test-resources
- test-compile
- process-test-classes
- test
- prepare-package
- package
- pre-integration-test
- integration-test
- post-integration-test
- verify
- install
- deploy
If we run mvn package, Maven executes the default lifecycle, running every phase in order from the beginning through package:
- validate
- initialize
- …
- prepare-package
- package
If we run mvn compile, Maven will also execute the default lifecycle, but this time it will only run up to compile, namely the following phases:
- validate
- initialize
- …
- process-resources
- compile
Another commonly used lifecycle in Maven is clean, which executes 3 phases:
- pre-clean
- clean (note that this clean is a phase, not a lifecycle)
- post-clean
Therefore, when we use the mvn command, the parameter following it is a phase, and Maven automatically runs up to the specified phase according to the lifecycle.
A more complex example is specifying multiple phases. For example, running mvn clean package, Maven first executes the clean lifecycle and runs up to the clean phase, then it executes the default lifecycle and runs up to the package phase. The actually executed phases are as follows:
- pre-clean
- clean (note that this clean is a phase)
- validate (starts executing the first phase of the default lifecycle)
- initialize
- …
- prepare-package
- package
During the actual development process, frequently used commands include:
- mvn clean: Cleans up all generated classes and jars;
- mvn clean compile: Cleans first, then executes up to compile;
- mvn clean test: Cleans first, then executes up to test. Because compile always runs before test, there is no need to specify compile here;
- mvn clean package: Cleans first, then executes up to package.
Most phases do nothing when they run, because no plugin goals are bound to them by default and we usually do not configure any in pom.xml.
The phases that are frequently used are actually only a few:
- clean: clean up
- compile: compile
- test: run tests
- package: package
Goal
Executing a phase in turn triggers one or more goals:
| Executed Phase | Corresponding Executed Goal |
|---|---|
| compile | compiler:compile |
| test | compiler:testCompile, surefire:test |
A goal name always takes the form abc:xyz, where abc is the prefix of the plugin that provides the goal and xyz is the goal itself.
Actually, if we draw an analogy, it becomes clear:
- lifecycle is equivalent to a Java package; it contains one or multiple phases;
- phase is equivalent to a Java class; it contains one or multiple goals;
- goal is equivalent to a class method; it is actually the one doing the real work.
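The relationship between a phase and its goals can also be configured explicitly. The fragment below is a sketch that binds one extra goal to the package phase, so that running mvn package also builds a jar of the sources. The choice of maven-source-plugin, its version, and the execution id are assumptions made for illustration.

```xml
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-source-plugin</artifactId>
            <!-- Illustrative version -->
            <version>3.3.0</version>
            <executions>
                <execution>
                    <!-- Arbitrary label for this execution -->
                    <id>attach-sources</id>
                    <!-- The phase that triggers the goal -->
                    <phase>package</phase>
                    <goals>
                        <!-- The goal that does the actual work -->
                        <goal>jar-no-fork</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```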
In most cases, we simply specify the phase, and it defaults to executing the goals bound by default to these phases. Only in a few instances do we directly specify running a goal, for example, starting a Tomcat server:
    mvn tomcat:run
Summary
Maven provides a standard build process through lifecycles, phases, and goals.
The most common build commands specify a phase, and Maven executes the lifecycle up to that phase:
- mvn clean
- mvn clean compile
- mvn clean test
- mvn clean package
Normally we rely on the goals bound to each phase by default, so there is no need to specify a goal.