Tutorial - Multiple Sequence Alignment (MSA) with ClustalW2 and Muscle
The final workflow after the execution and some results.
5 steps required
Step 1. Open a new workflow project
Open the File Menu and select New.
When asked if you want to migrate your data and workflow to the new project, select NO.
Then, enter a new project filename.
For the time being, we will be working with the default name.
The new project file (New_Untitled.db) will be in the \projects\ directory.
Step 2. Insert a new sequence dataset
Go into the File Menu and select the Import Sequences menu.
Select the file hiv.fasta located in the /examples directory.
A new dialog box will be shown where you can enter some information about the sequence files, such as the type of sequences (DNA, RNA or AA [Amino Acid]).
Select Import when you've finished entering the information.
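The import dialog asks for the sequence type. As a rough illustration of what that choice means, the sketch below reads FASTA records and guesses whether each one is DNA, RNA or AA from its alphabet. It is a minimal sketch and not how the import dialog itself works; the function names `read_fasta` and `guess_sequence_type` and the toy records are assumptions made for this example.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def guess_sequence_type(sequence):
    """Rough heuristic: classify a sequence as 'DNA', 'RNA' or 'AA'."""
    # Ignore gap and unknown-residue symbols before checking the alphabet.
    residues = set(sequence.upper()) - set("-N?*.")
    if residues <= set("ACGT"):
        return "DNA"
    if residues <= set("ACGU"):
        return "RNA"
    return "AA"


# Toy records for illustration only (not taken from hiv.fasta).
example = StringIO(">seq1 toy record\nACGTAC\nGTTA\n>seq2\nACGUACGU\n>seq3\nMKVLAT\n")
records = list(read_fasta(example))
types = [guess_sequence_type(seq) for _, seq in records]
print(types)
```

The heuristic can misclassify short protein sequences that happen to use only the letters A, C, G and T, so the type selected in the dialog should still be checked by hand.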
Note: The name prefix will be added to all files. It allows for a simplified loading of multiple sequence alignment files.
Once imported, you can find the newly imported MultipleSequences in the Workflow Database pane. (Note: the name of the MultipleSequences might be different).
To use it, "drag and drop" the new MultipleSequences object into the Workflow area (see below).
Step 3. Add the MSA applications
Go into the Workflow Tools Pane and, using the same "drag and drop" motion, add the ClustalW and Muscle applications to the Workflow area.
Note: Executing this workflow locally can be SLOW.
If you have Internet access, you can use the ClustalW2 (Web EBI) and
Muscle (Web EBI) applications instead, which result in a running time
of ~7 minutes for this HIV dataset.
To speed up the analysis, open the Muscle configuration box by double-clicking with the left mouse button on the Muscle application box and, under "Fast setting", select "Fastest possible (nucleotides)".
Double-click with the left mouse button on the red dot (identifying the application data outputs) to show all application outputs.
Note: The blue color for any input or output data indicates that it is either undefined or has not been generated. Once the data are generated, the output will turn green, as shown below.
Connect the MultipleSequences (examples\hiv.fasta) to the application boxes by selecting the connection box at the end of the MultipleSequences object and linking it to each of the multiple sequence alignment applications (i.e. ClustalW2 and Muscle).
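For readers who want to reproduce this step outside the graphical workflow, the sketch below builds roughly equivalent command lines for the two aligners. It rests on several assumptions: that the executables are named `muscle` (MUSCLE v3) and `clustalw2`, that the output filenames shown are acceptable, and that the "Fastest possible (nucleotides)" setting corresponds to the `-maxiters 1 -diags` options suggested in the MUSCLE v3 manual. The options the workflow actually passes may differ.

```python
import shlex


def muscle_command(infile, outfile, fastest_nucleotides=True):
    """Build a MUSCLE v3 command line as a list of arguments."""
    cmd = ["muscle", "-in", infile, "-out", outfile]
    if fastest_nucleotides:
        # Fastest nucleotide settings suggested by the MUSCLE v3 manual;
        # assumed here to match the "Fastest possible (nucleotides)" choice.
        cmd += ["-maxiters", "1", "-diags"]
    return cmd


def clustalw2_command(infile, outfile):
    """Build a ClustalW2 command line for a DNA alignment in FASTA format."""
    return [
        "clustalw2",
        "-INFILE=" + infile,
        "-ALIGN",
        "-TYPE=DNA",
        "-OUTPUT=FASTA",
        "-OUTFILE=" + outfile,
    ]


# Hypothetical output filenames, chosen only for this example.
commands = [
    muscle_command("examples/hiv.fasta", "hiv_muscle.fasta"),
    clustalw2_command("examples/hiv.fasta", "hiv_clustalw2.fasta"),
]
for cmd in commands:
    # Print the commands instead of running them, so the sketch works
    # even when the aligners are not installed.
    print(shlex.join(cmd))
```

To actually run a command, each list can be passed to `subprocess.run(cmd, check=True)`, provided the corresponding program is installed and on the PATH.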
Step 4. Execute the workflow
In the far right corner of the workflow area, click on the Run button.
While the applications are running, you can follow their progress either by looking at the workflow progress bar (above the workflow area) or by clicking on the Output panel (see below).
You can save this generated application output to a text file by clicking on the bottom-left button "Save as Text".
When all the workflow executions have been carried out, a completion icon should appear in the right corner of each application box. If an error has occurred, a warning sign will be displayed.
Step 5. Display the results
To view a Text View of the multiple sequence alignment, either double-click on the application alignment output or right-click on it and select "View".
In this view (see below), you can select either a simplified representation of the aligned sequences, or a fasta or phylip format for rapid "Cut-and-Paste" to other applications.
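To make the difference between the two export formats concrete, the sketch below converts a small aligned dataset to strict sequential PHYLIP text. It is an illustration under stated assumptions: the function name `to_phylip` and the toy alignment are made up for this example, and the exact PHYLIP variant produced by the Text View may differ (for instance, interleaved or relaxed names).

```python
def to_phylip(records):
    """Format (name, aligned_sequence) pairs as strict sequential PHYLIP text."""
    records = list(records)
    if not records:
        raise ValueError("no sequences to write")
    length = len(records[0][1])
    # Every row of an alignment must have the same number of columns.
    if any(len(seq) != length for _, seq in records):
        raise ValueError("aligned sequences must all have the same length")
    # Header line: number of sequences, then number of alignment columns.
    lines = [f" {len(records)} {length}"]
    for name, seq in records:
        # Strict PHYLIP reserves exactly 10 characters for the name,
        # so longer names are truncated and shorter ones are padded.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


# Toy alignment for illustration only (not real HIV data).
alignment = [("seqA", "ACG-TA"), ("seqB_long_name", "ACGGTA")]
phylip_text = to_phylip(alignment)
print(phylip_text, end="")
```

Because names are cut to 10 characters in this variant, sequences whose names share the same first 10 characters would become indistinguishable; renaming them before export avoids that.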
To display a more graphical representation, right-click on the alignment output and select "View Graphic".