First, import all the libraries:

    # import all libraries
    import networkx as nx
    import warnings
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline
    warnings.filterwarnings('ignore')

### Reading Data

    # Reading data (DataFrame.from_csv was removed from pandas; read_csv replaces it)
    market = pd.read_csv("nasdaq.csv", index_col=0, parse_dates=True)
    market.shape

**Introduction**

Negative shocks spread through stock markets during the financial crisis. To model the stock market with network analysis, we represent each stock as a node. Defining the connections between nodes, however, is not straightforward.

A traditional way to create edges is to look at the correlation of some chosen attribute over a selected time frame. If the correlation is above a positive threshold or below a negative one, the edge exists, as discussed in the section on the importance of nodes.

The DataFrame `market` contains daily returns of 2000 Nasdaq stocks from 2013 to 2018.

A pair of stocks is connected if the absolute value of their correlation is high enough. We first calculate the correlation matrix of the 2000 stocks and plot the histogram of the correlations.
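A minimal sketch of this step, using a small synthetic returns table in place of `market` so that it runs on its own. The 0.3 cutoff and the names `returns`, `pair_corr`, and `threshold` are illustrative assumptions, not values from the assignment.

```python
import numpy as np
import pandas as pd
import networkx as nx

# Synthetic stand-in for `market`: rows are trading days, columns are tickers.
rng = np.random.default_rng(0)
common = rng.normal(0, 0.01, size=(250, 1))   # shared market factor
noise = rng.normal(0, 0.01, size=(250, 6))    # stock-specific noise
returns = pd.DataFrame(common + noise, columns=list("ABCDEF"))

# Pairwise Pearson correlation between the return series.
corr = returns.corr()
c = corr.to_numpy()

# Keep each pair once and drop the diagonal (self-correlation is always 1).
iu = np.triu_indices(len(corr), k=1)
pair_corr = c[iu]

# Histogram of the pairwise correlations (plt.hist(pair_corr) would draw it).
counts, bin_edges = np.histogram(pair_corr, bins=20, range=(-1, 1))

# Connect two stocks when |correlation| reaches the (hypothetical) threshold.
threshold = 0.3
tickers = list(corr.columns)
G = nx.Graph()
G.add_nodes_from(tickers)
for i, j in zip(*iu):
    if abs(c[i, j]) >= threshold:
        G.add_edge(tickers[i], tickers[j], weight=float(c[i, j]))

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```

On the real data, the histogram is a reasonable guide for picking the threshold: a cutoff in the tail keeps the graph sparse enough for community detection to be informative.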

**Problem 1**

Use the Louvain method to find clusters in `newmarket`. Take the 4 largest clusters and check whether they are correlated with financial measures, for example overall return, mean daily return, volatility, or yearly Sharpe ratio (assume the risk-free rate is 0%). In other words, are there clusters that take very high or very low values in one of these measures? If so, this would help us select stocks.
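One way to approach this is sketched below on synthetic data. It assumes networkx 2.8 or later (for `louvain_communities`), 252 trading days per year, and a hypothetical correlation cutoff of 0.5; the names `returns`, `measures`, and `summary` are illustrative and do not come from the assignment.

```python
import numpy as np
import pandas as pd
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Synthetic stand-in for `market`: two groups of 5 stocks, each driven by its own factor.
rng = np.random.default_rng(1)
n_days = 500
calm = rng.normal(0.0005, 0.01, size=(n_days, 1)) + rng.normal(0, 0.003, size=(n_days, 5))
wild = rng.normal(0.0, 0.02, size=(n_days, 1)) + rng.normal(0, 0.003, size=(n_days, 5))
returns = pd.DataFrame(np.hstack([calm, wild]), columns=[f"S{i}" for i in range(10)])

# Threshold graph: connect two stocks when |correlation| >= 0.5 (hypothetical cutoff).
corr = returns.corr().to_numpy()
tickers = list(returns.columns)
G = nx.Graph()
G.add_nodes_from(tickers)
for i in range(len(tickers)):
    for j in range(i + 1, len(tickers)):
        if abs(corr[i, j]) >= 0.5:
            G.add_edge(tickers[i], tickers[j])

# Louvain communities, largest first; keep at most 4.
communities = louvain_communities(G, seed=0)
largest = sorted(communities, key=len, reverse=True)[:4]

# Per-stock financial measures, with the risk-free rate taken as 0%.
TRADING_DAYS = 252  # assumed number of trading days per year
measures = pd.DataFrame({
    "overall_return": (1 + returns).prod() - 1,
    "mean_daily_return": returns.mean(),
    "volatility": returns.std(),
    "sharpe_annual": returns.mean() / returns.std() * np.sqrt(TRADING_DAYS),
})

# Label each stock with its cluster and average the measures per cluster.
cluster_of = {node: k for k, members in enumerate(largest) for node in members}
measures["cluster"] = pd.Series(cluster_of)
summary = measures.groupby("cluster").mean()
print(summary)
```

Comparing the rows of `summary` shows whether any cluster stands out in a given measure; a box plot of each measure grouped by cluster would give a fuller picture than the means alone.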

**Problem 2**

Take the third-largest cluster you obtained in Problem 1 and name it G2.

Compute the (unweighted) shortest-path distance between every pair of nodes in G2.
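A small sketch of the distance computation, with a 6-node cycle standing in for G2. It assumes the graph is connected; if G2 is not, one option is to restrict it to its largest connected component first.

```python
import networkx as nx
import numpy as np

# Stand-in for G2: a 6-node cycle (the real G2 is the third-largest cluster).
G2 = nx.cycle_graph(6)
nodes = list(G2.nodes())

# Unweighted shortest-path length (hop count) between every pair of nodes.
lengths = dict(nx.all_pairs_shortest_path_length(G2))

# Arrange the lengths as a symmetric distance matrix in a fixed node order.
D = np.array([[lengths[u][v] for v in nodes] for u in nodes], dtype=float)
print(D)
```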

Use multidimensional scaling (MDS) to embed this cluster in 2-dimensional space, then use the k-means method to split G2 into 4 communities.

Plot G2 using the coordinates obtained from MDS. Nodes from different communities should have different colors.
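The three steps can be sketched as follows. This assumes scikit-learn is available for `MDS` and `KMeans` (the notebook's import cell does not load it) and uses a connected caveman graph as a stand-in for G2; all variable names are illustrative.

```python
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

# Stand-in for G2: 4 cliques of 5 nodes joined in a ring.
G2 = nx.connected_caveman_graph(4, 5)
nodes = list(G2.nodes())

# Unweighted shortest-path distance matrix (G2 must be connected).
lengths = dict(nx.all_pairs_shortest_path_length(G2))
D = np.array([[lengths[u][v] for v in nodes] for u in nodes], dtype=float)

# Embed the nodes in 2 dimensions so that Euclidean distances approximate D.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)

# Cluster the embedded points into 4 communities with k-means.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(coords)

# Draw G2 at the MDS coordinates, one color per k-means community.
pos = {node: coords[k] for k, node in enumerate(nodes)}
fig, ax = plt.subplots(figsize=(6, 6))
nx.draw_networkx(G2, pos=pos, node_color=labels.tolist(), cmap="tab10",
                 node_size=80, with_labels=False, ax=ax)
ax.set_title("G2 in MDS coordinates, colored by k-means community")
```

Because MDS starts from a random configuration, the layout can differ between runs; fixing `random_state` keeps the plot reproducible.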

**Problem 3**
How well does the network model perform in identifying good stocks for investment?
How edges are defined is extremely important for finding patterns between clusters and stock performance.
Could you provide one definition of links in a network of stocks, and explain why you think a network with such links can help identify good stocks or good patterns for investment in the stock market?

**Answer:**
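With higher-dimensional coordinates, this method is more likely to group nodes that are close in shortest-path distance.
**If the betweenness of an edge is high, the link is weak**: many shortest paths between groups pass through it, so it acts as a bridge between communities rather than a tie within one.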
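A toy illustration of the edge-betweenness claim, using a barbell graph (two 4-node cliques joined by a single edge) rather than the stock network. The graph and the variable names are illustrative assumptions.

```python
import networkx as nx

# Two 4-node cliques (nodes 0-3 and 4-7) joined by one bridging edge.
G = nx.barbell_graph(4, 0)

# Fraction of all shortest paths that pass through each edge.
eb = nx.edge_betweenness_centrality(G)

# The edge with the highest betweenness is the one linking the two cliques.
bridge = max(eb, key=eb.get)
print("highest-betweenness edge:", bridge)
```

Every shortest path between the two cliques has to cross the bridging edge, which is why its betweenness exceeds that of any edge inside a clique.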

In the graph above, the nodes of cluster 1 lie close to one another and belong to the same group, so they form a good pattern for stock selection.