Channel: Samuel Schmitt - Senior IT Consultant - Basel » Architecture

Step by Step Magnolia CMS Clustering

In this article I will explain how to cluster two Magnolia public instances so that they share the content of all workspaces.
The author instance will activate content to only one public instance.

Both public instances will persist their content in the same MySQL database.

The author instance will keep the default Derby database.

 

The diagram below represents what we will achieve:

 

What do you need for this tutorial

  • Good knowledge of Magnolia
  • A Magnolia bundle (I use 4.5.10 EE, but it should work with any recent version of Magnolia)
  • MySQL installed

Step by Step Clustering

Preparation of the environment

Unzip the Magnolia bundle and duplicate the public instance magnoliaPublic; name the copy magnoliaPublic2.
Adapt the configuration of magnoliaPublic2 (in WEB-INF/config).

For the tutorial, I simply duplicated WEB-INF/config/magnoliaPublic and named the copy magnoliaPublic2, so public 1 and public 2 have the same configuration.

Create a database in MySQL and name it magnolia_public.

Finally, create a folder called shared_data somewhere on your file system.

NOTE: This environment is just for local testing. Don’t do exactly the same in production; do it better.

 

Update the configuration of the public instances

First, in the magnolia.properties file of both publics, update the following line:

magnolia.repositories.jackrabbit.config=WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml

It points to the MySQL repository configuration.

 

Now, let’s change this file.

But wait a second: it helps to first understand what we are trying to do.

Have a look at this page and come back here.

Now you understand that both publics need access to the same persistent storage; in other words, there is only one MySQL database for both publics.

We also have to meet all the requirements described there (I paste them again in case you didn’t read the page):

  • Each cluster node must have its own repository configuration.
  • A DataStore must always be shared between nodes, if used.
  • The global FileSystem on the repository level must be shared (only the one that is on the same level as the data store; only in the repository.xml file).
  • Each cluster node needs its own (private) workspace level and version FileSystem (only those within the workspace and versioning configuration; the ones in the repository.xml and workspace.xml file).
  • Each cluster node needs its own (private) Search indexes.
  • Every cluster node must be assigned a unique ID.
  • A journal type must be chosen, either based on files or stored in a database.
  • Each cluster node must use the same (shared) journal.
  • The persistence managers must store their data in the same, globally accessible location.
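
On the journal requirement: this tutorial uses a database journal, but Jackrabbit also ships a file-based one. As a rough sketch of that alternative, the Cluster node could look like the fragment below. The /path_to/shared_data/journal folder is my assumption for illustration and is not part of the setup described in this article; the id is whatever unique ID you give the node.

```xml
<!-- Alternative sketch: file-based journal instead of the database journal -->
<Cluster id="cid_pub1" syncDelay="2000">
  <Journal class="org.apache.jackrabbit.core.journal.FileJournal">
    <!-- Private to each node: records the last revision this node has processed -->
    <param name="revision" value="${rep.home}/revision.log" />
    <!-- Shared by all nodes: hypothetical folder that every instance must be able to reach -->
    <param name="directory" value="/path_to/shared_data/journal" />
  </Journal>
</Cluster>
```

The trade-off is that a file journal needs a folder every node can read and write, whereas the database journal reuses the database the publics already share.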

 

Now, we can finally change the file.

 

Open WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml and let’s edit it from top to bottom.
(You will have to make these changes in the configuration of both public instances.)

First change: add a Cluster node.

<Cluster id="cid_pub1" syncDelay="2000">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="driver" value="com.mysql.jdbc.Driver" />
    <param name="url" value="jdbc:mysql://localhost:3306/magnolia_public" />
    <param name="user" value="root" />
    <param name="password" value="" />
    <param name="schema" value="mysql" />
    <param name="schemaObjectPrefix" value="journal_" />
  </Journal>
</Cluster>

The cluster id must be unique per instance (use cid_pub2 for public 2).
The revision path must remain in the instance's private repository on the file system.
In url, set the URL of the database. (With a standard local setup, it should be the same as in the example.)
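
For public 2, the same node would look as follows. This simply restates the example above with cid_pub2, under the same local-testing assumptions (root user, empty password); the journal settings stay identical so that both instances share one journal.

```xml
<!-- magnoliaPublic2: only the cluster id differs from public 1 -->
<Cluster id="cid_pub2" syncDelay="2000">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <!-- ${rep.home} resolves per instance, so each public keeps its own revision file -->
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="driver" value="com.mysql.jdbc.Driver" />
    <!-- Same database and table prefix as public 1: this is the shared journal -->
    <param name="url" value="jdbc:mysql://localhost:3306/magnolia_public" />
    <param name="user" value="root" />
    <param name="password" value="" />
    <param name="schema" value="mysql" />
    <param name="schemaObjectPrefix" value="journal_" />
  </Journal>
</Cluster>
```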

 

Next stop: the DataSource.
Point it to the same database as in the Cluster node.

  <DataSources>
    <DataSource name="magnolia">
      <param name="driver" value="com.mysql.jdbc.Driver" />
      <param name="url" value="jdbc:mysql://localhost:3306/magnolia_public" />
      <param name="user" value="root" />
      <param name="password" value="" />
      <param name="databaseType" value="mysql"/>
      <param name="validationQuery" value="select 1"/>
    </DataSource>
  </DataSources>

 

Finally, the FileSystem and the DataStore must use the shared folder.

<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
  <param name="path" value="/path_to/shared_data/repository" />
</FileSystem>

.....

<DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
  <param name="path" value="/path_to/shared_data/repository/datastore"/>
  <param name="minRecordLength" value="1024"/>
</DataStore>

That’s it, the rest doesn’t change.
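
"The rest" includes the workspace-level FileSystem and SearchIndex, which the requirements say must stay private to each node. As a sketch of what that section typically looks like in a stock Jackrabbit configuration (the exact classes and paths in your Magnolia bundle may differ, so treat these values as assumptions), note how the paths resolve under ${wsp.home} rather than the shared folder:

```xml
<!-- Left unchanged: these paths resolve under each instance's own repository home -->
<Workspace name="${wsp.name}">
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${wsp.home}" />
  </FileSystem>
  <!-- PersistenceManager omitted in this sketch; keep it exactly as it is in your file -->
  <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index" />
  </SearchIndex>
</Workspace>
```

If any of these paths pointed into shared_data, two nodes would write to the same index, which is what the requirements list warns against.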

 

I repeat myself, but make these changes in the configuration of both public 1 and public 2.

Start the installation and configure a single subscriber

First, only start the author instance and one public instance.

 

Once the installation of the public is done, you can start the other public (which won’t run an installation because everything was already installed by the first one).

Finish by configuring a single subscriber on the author.

You can now create a new page and publish it. Check both publics: the page is available on both instances.

Advantages of clustering

For CE projects, where you can only set up one subscriber, clustering lets you set up load balancing with multiple public instances sharing the workload.

For EE projects, when your website receives more than a million visitors per day and generates a lot of traffic, you may well need more than 10 public instances to handle the load.
It’s not always necessary to maintain that many databases (one per instance). Keeping a few databases and clustering 4 to 5 public instances together could be a good approach.

 

More resources about Magnolia CMS and Clustering

Here are the links that were useful for this tutorial:

