Developing a high-level autonomous collision avoidance system for ships that can operate in an unstructured and unpredictable environment is challenging. In congested sea areas in particular, each ship must continuously make decisions to avoid collisions with other ships in a busy and complex waterway. Furthermore, recent reports indicate that many marine collisions are caused by, or related to, human decision failures involving a lack of situational awareness and noncompliance with the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs). In this study, we propose an efficient method for multiship collision avoidance based on deep reinforcement learning (DRL), extending our previous study (Zhao et al., 2019). The proposed method uses a deep neural network (DNN) to map the states of encountered ships directly to the own ship's steering commands, expressed as rudder angles. The DNN is trained over multiple ships in a rich variety of encounter situations using a policy-gradient-based DRL algorithm. To handle multiple encountered ships, we classify them into four regions based on COLREGs and consider only the nearest ship in each region. We validate the proposed method in a variety of simulated scenarios with thorough performance evaluations, and demonstrate that the final DRL controller obtains time-efficient and collision-free paths for multiple ships. Simulation results indicate that multiple ships can avoid collisions with each other while simultaneously following their own predefined paths. In addition, the proposed approach adapts well to unknown, complex environments with various encountered ships.
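The region-based filtering of encountered ships can be illustrated with a minimal sketch. The abstract does not state the sector boundaries, so the values below are assumptions: the 112.5° and 247.5° limits follow the COLREGs definition of overtaking (more than 22.5° abaft the beam), while the ±5° head-on sector is a convention often used in the collision avoidance literature. The function names, the region labels, and the flat east/north coordinate frame are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

REGION_NAMES = ("head_on", "starboard", "astern", "port")


def region_of(rel_bearing_deg: float) -> str:
    """Map a relative bearing (degrees clockwise from the own ship's bow)
    to one of four regions. Sector limits are assumed, not from the paper."""
    b = rel_bearing_deg % 360.0
    if b >= 355.0 or b < 5.0:
        return "head_on"
    if b < 112.5:
        return "starboard"
    if b < 247.5:
        return "astern"
    return "port"


def nearest_per_region(
    own_x: float,
    own_y: float,
    own_heading_deg: float,
    targets: List[Tuple[float, float]],
) -> Dict[str, Optional[int]]:
    """Return, for each region, the index of the nearest target ship,
    or None if the region is empty. Positions are (east, north) in metres;
    heading is in degrees clockwise from north."""
    best_index: Dict[str, Optional[int]] = {name: None for name in REGION_NAMES}
    best_dist: Dict[str, float] = {name: math.inf for name in REGION_NAMES}

    for i, (tx, ty) in enumerate(targets):
        dx, dy = tx - own_x, ty - own_y
        dist = math.hypot(dx, dy)
        # True bearing measured clockwise from north, then made relative to the bow.
        true_bearing = math.degrees(math.atan2(dx, dy))
        region = region_of(true_bearing - own_heading_deg)
        if dist < best_dist[region]:
            best_dist[region] = dist
            best_index[region] = i

    return best_index


# Own ship at the origin heading north, with five target ships around it.
example = nearest_per_region(
    0.0, 0.0, 0.0,
    [(0.0, 1000.0), (500.0, 500.0), (2000.0, 0.0), (0.0, -300.0), (-400.0, 0.0)],
)
```

In this example two targets fall in the starboard region, and only the closer one (index 1) is kept. This gives the network a fixed-size input of at most four ships regardless of how many ships are present, which is presumably the motivation for the filtering step.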